The invention relates to new expression systems and in particular to an expression system in which a gene of interest is expressed 
at an optimal level. The invention provides a recombinant expression vector comprising a gene of interest and a selectable marker gene, 
wherein the selectable marker gene is arranged downstream of the gene of interest and a stop codon associated with the gene of interest is 
spaced from a start codon of said selectable marker gene at a distance which is sufficient to ensure that translation reinitiation is required 
before said selectable marker protein is expressed from the corresponding mRNA. Examples of such expression systems are retroviral 
packaging cell lines and a number of preferred cell lines have been identified. 
Expression systems

The present invention relates to new expression systems,
and in particular to expression systems in which a gene of
interest is expressed at an optimal level. Particular
examples of such expression systems are retroviral packaging
cell lines and a number of preferred cell lines have been
identified.

The ability of eukaryotic and prokaryotic ribosomes to
reinitiate translation at an internal start codon within an
mRNA sequence has previously been recognised. Studies have
been reported in which the efficiency of the process, which
is generally regarded as being low, has been connected with
the length of the intercistronic sequence (Kozak (1987)
Mol. Cell Biol. 7, 3438-3445). Selection of this sequence
or spacer as 70 bp in length, and containing no other start
codons, has been previously reported as being optimal for
reinitiation in a eukaryotic cell line (Cosset F-L.,
Virology (1991) 185, 862).

The applicants have found a way in which the inefficiency
associated with the translation reinitiation process can be
used to good effect.

According to the present invention there is provided a
recombinant expression vector comprising a gene of interest
and a selectable marker gene, wherein the selectable marker
gene is arranged downstream of the gene of interest and a
stop codon associated with the gene of interest is spaced
from a start codon of said selectable marker gene at a
distance which is sufficient to ensure that translation re-
initiation is required before said selectable marker protein
is expressed from the corresponding mRNA.
The invention further provides a process for producing cell
lines in which a gene of interest is expressed, which
process comprises transforming host cells with an expression
vector comprising said gene of interest and a selectable
marker gene, wherein the selectable marker gene is arranged
downstream of the gene of interest and a stop codon
associated with the gene of interest is spaced from a start
codon of said selectable marker gene at a distance which is
sufficient to ensure that translation re-initiation is
required before said selectable marker protein is expressed
from the corresponding mRNA, and selecting those cells where
expression of the selectable marker gene may be detected.

Since re-initiation of translation is a relatively
inefficient process, this means that the selectable marker
protein will be expressed at lower levels than the product
of the gene of interest. When the marker protein is
expressed at detectable levels, the gene of interest will be
expressed at higher levels. This will ensure that during
the subsequent selection procedure, only those cell clones
which express the gene of interest at higher or optimal
levels will survive. Low expressing clones will be
eliminated by the selection process.
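
The selection logic described above can be illustrated with a small numerical sketch. It assumes, purely for illustration, that marker expression is a fixed fraction (the re-initiation efficiency) of the expression of the gene of interest, and that a clone survives selection when its marker level reaches a threshold. The function names, the efficiency value and the expression levels are hypothetical and are not taken from the experiments described herein.

```python
from typing import List


def min_gene_expression(marker_threshold: float,
                        reinitiation_efficiency: float) -> float:
    """Lowest gene-of-interest level that still gives a selectable marker level.

    Assumption (illustrative only): marker level = efficiency * gene level.
    """
    if not 0 < reinitiation_efficiency <= 1:
        raise ValueError("efficiency must be in the interval (0, 1]")
    return marker_threshold / reinitiation_efficiency


def surviving_clones(gene_levels: List[float],
                     marker_threshold: float,
                     reinitiation_efficiency: float) -> List[float]:
    """Keep only clones whose marker level reaches the selection threshold."""
    return [g for g in gene_levels
            if g * reinitiation_efficiency >= marker_threshold]


# Hypothetical clones with arbitrary expression units.
clones = [1.0, 2.0, 4.0, 8.0]
print(min_gene_expression(1.0, 0.25))       # gene level needed to survive
print(surviving_clones(clones, 1.0, 0.25))  # clones that pass selection
```

Under these assumptions, a lower re-initiation efficiency raises the minimum expression of the gene of interest that a surviving clone must have, which is the effect the invention relies on.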

Cells transformed with the above-described expression
vectors form a further aspect of the invention.

The host cells are suitably eukaryotic or prokaryotic host 
cells, preferably eukaryotic host cells. 

The number of nucleotides in the space between the stop 
codon of the gene of interest and the start codon of the 
selectable marker will suitably be in the range of from 20
to 200 nucleotides, preferably from 60 to 80 nucleotides, even
more preferably from 70 to 80 nucleotides.
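
As a minimal sketch, the stated ranges can be expressed as a simple check on a candidate spacer length. The function name and the returned labels are hypothetical and only restate the ranges given above.

```python
def classify_spacer(length_nt: int) -> str:
    """Classify a stop-codon-to-start-codon spacer length (in nucleotides).

    The ranges are those stated in the text: 20-200 (suitable),
    60-80 (preferred) and 70-80 (more preferred).
    """
    if 70 <= length_nt <= 80:
        return "more preferred"
    if 60 <= length_nt <= 80:
        return "preferred"
    if 20 <= length_nt <= 200:
        return "suitable"
    return "outside stated range"


for n in (10, 65, 74, 150):
    print(n, classify_spacer(n))
```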
The vectors used in the process of the invention may be any
of the known types, for example expression plasmids or viral
vectors.

Selected cells may be cultured and if required, the protein 
product of the gene of interest isolated from the culture 
using conventional techniques. Alternatively, expression of 
the gene of interest may result in other desired effects, 
for example, where the gene of interest is included as part 
of a viral packaging construct.

Some experimental and clinical gene transfer protocols
require the design of gene transfer vectors suitable for in
vivo gene delivery (Miller, A.D. 1992. Nature 357:455-
460). Retroviral vectors are attractive candidates for
such applications, because they can provide stable gene
transfer and expression (Samarut J. et al., Meth. Enzymol. in
press) and because packaging cells have been designed which
produce non-replication competent viruses (Miller A.D. (1990)
Hum Gene Ther. 1: 5-14). However, currently available
recombinant retroviruses suffer from a number of drawbacks.

Packaging cell lines provide in trans the retroviral
proteins encoded by the gag, pol and env genes required to
obtain infectious retroviral particles. The gag and pol
products are respectively the structural components of the
virion cores and the replication machinery (enzymes) of the
retroviral particles, whereas the env products are envelope
proteins responsible for the host-range of the virions and
for the initiation of infection and for sensitivity to
humoral factors. An ideal packaging cell line should
produce retroviruses that only contain the retroviral vector
genome, and absolutely no replication-competent genomes or
defective genomes encoding some of the viral structural
genes.
A number of packaging cell lines for human gene
transfer have been constructed in the past by introducing
plasmid DNAs which contain "helper genomes" encoding gag,
pol and/or env genes into cells.

Retroviral packaging cell lines are cells that have been
engineered to provide in trans all the functions required to
express infectious retroviral vectors. A helper genome (or
construct or unit) is herein also referred to as
"retroviral packaging construct (or unit)" or "packaging-
deficient construct (or genome unit)" or "gag-pol/env
expression plasmids".

Much effort has been made to design strategies to optimize
the helper genomes in order (i) to get the highest
production of retroviral packaging functions (which
correlates with infection titers of retroviral particles)
and (ii) to minimise the chance that the helper genome can
be transmitted via the viral particles (which may lead to
emergence of unwanted retroviral forms).

The first of these packaging cell lines used full length
retroviral genomes as helper genomes that had been crippled
for important cis-regulated replicative functions (reviewed
in Miller, Hum. Gene Ther. 1:5-14 1990). In order to reduce
the possibility of occurrence of replication-competent
viruses and of transfer of virus structural genes, a second
generation of safer packaging cell lines has been designed
by using two separate and complementary helper genomes which
express either gag-pol or env and are packaging-deficient
(Miller supra).

The cells into which these helper genomes were introduced
were isolated by cotransfecting them with plasmids encoding
selectable markers. However, as no selection was applied on
the packaging-deficient retroviral genome itself, the helper
functions can be lost during the passages of the cells in
culture and the current packaging systems provide limited
titers of infectious retroviral vectors, usually only of the
order of 10^-10^ infectious units (i.u.)/ml. Indeed the
cotransfection with a plasmid encoding a selectable marker
does not directly select the best gag-pol/env-expressing
cells.

The invention further provides a retroviral packaging cell
line comprising a host cell transformed with (i) a packaging-
deficient construct which expresses a viral gag-pol gene and
a first selectable marker gene, and/or (ii) a packaging-
deficient construct which expresses a viral env gene and a
second selectable marker gene; wherein the start codons of the
first and second selectable markers are spaced from the stop
codons of the viral gag-pol gene and the viral env gene
respectively by a distance which ensures that reinitiation
of mRNA translation is required for expression of marker
protein product of said first and/or second selectable
marker gene.

The retroviral packaging cell line may be obtained by the 
above described process which will involve selecting 
transfected cells which express said first and/or second 
marker genes. 

By using helper constructs which are directly selectable and
which provide for high expression of the viral gene, high
titre retroviral vectors may be obtained.

Helper constructs for use in the process form a further
aspect of the invention.



The retroviral vectors prepared from the conventional
packaging cell lines are usually not contaminated by
replication-competent retroviruses (RCRs). However,
recombinant amphotropic murine retroviruses have been shown
to arise spontaneously from certain packaging cell lines.
The generation of such RCRs involves recombination at least
between gag-pol/env packaging sequence and vector sequences
(Cosset et al., Virology (1993) 193:385-395).

Recombinant RCRs have been associated with the development 
of lymphomas in some severely immunosuppressed monkeys 
(Donahue et al., J. Exp Med (1992) 176: 1125-1135). In 
addition, retroviral vector preparations may also contain,
at low frequencies, retroviruses coding for functional
envelope glycoproteins (Kozak and Kabat, 1990, J. Virol. 64:
3500-3508) or for gag-pol proteins. Although the
pathogenicity of these gag-pol or env recombinant
retroviruses is probably low, more evolved recombinant
retroviruses with higher pathogenic potential may occur when 
injected in vivo, by recombination and/or complementation of 
the initial recombinant viruses with some endogenous 
retroviruses. 

In a preferred embodiment of the retroviral packaging cell 
lines of the invention, the overlapping sequences between 
the genomes of the retroviral vector and the helper 
construct are reduced, for example as compared to constructs
such as CRIPenv and CRIPAMgag (Danos et al., Proc. Natl. 
Acad. Sci USA 85: 6460-6464). In particular, the viral 
sequences in the helper construct are reduced, for example, 
not only the packaging sequence but also the 3' Long 
Terminal Repeat (LTR) , the 3' non-coding sequence and/or the 
5'LTR may be eliminated. 

The possibility of generation of such RCRs and recombinant
retroviruses can be reduced by reducing the overlapping
sequences between the genomes of both the retroviral vector
and the helper construct.

Conventional retroviral vectors are strongly inactivated by
human serum, which makes them of limited or no use for in
situ gene transfer in gene therapy applications. It has
previously been shown that inactivation by complement in
human serum is controlled by the cell line used to produce
the virions and by viral envelope determinants (Takeuchi et
al., J. Virol (1994) 68:8001-8007). In particular,
inactivation is caused by some properties of the cell lines
that have been used to construct the packaging cells (NIH-
3T3) and also by viral determinants located in the
retroviral envelope (Takeuchi et al., J. Virol
(1994) 68:8001-8007). In vivo gene delivery is an important
goal for a number of human gene therapy strategies.

The applicants have found that certain cell lines form
preferred packaging cell lines.






Particularly preferred packaging cell lines are the HT1080
line, the TE671 line, the 3T3 line, the 293 line and the Mv-
1-Lu line. One example of retroviral packaging cells that
will produce complement-resistant virus comprises human
HT1080 cells which express RD114 envelope. Such cells form a
preferred aspect of the invention.



Packaging cell lines according to the invention provide 50-
100 fold increased titers of retroviral vectors as compared
to conventional packaging cell lines. Retroviral vectors
provided by these new cells are safe, in terms of generation
of RCRs, and considerably more resistant to inactivation by
human complement.






Packaging cell lines according to the invention may be able
to transduce helper-free, human complement-resistant
retroviral vectors at titers consistently higher than 10^
i.u./ml.

Suitable semi-packaging cell lines in accordance with the
invention are those which express only the gag-pol genes.
Such cell lines may suitably be derived from TE671, Mink Mv-
1-Lu, HT1080, 293 or NIH-3T3 cells by introduction of
plasmid CeB (the MoMLV gag-pol expression unit).

Particularly preferred expression vectors in accordance with
the invention for use in retroviral packaging cell lines are
those which include MLV gag and pol genes such as CeB.
Other plasmids may include gag and pol genes from other
retroviruses or chimeric or mutated gag and pol genes.

Various viral and retroviral envelope genes may be included
in the plasmids such as MLV-A envelope, GALV envelope, VSV-G
protein, BaEV envelope, RD114 envelope and chimeric or
mutated envelopes. Plasmids which include the RD114 env
gene such as FBdelPRDSAF, as illustrated hereinafter, provide
one example of suitable constructs.

The novel retroviral packaging cells described hereinafter
have been designated FLY cells, and may be designed for in
vivo gene delivery.

Considerable variations were found between the various cell 
lines screened for their ability to release type C mammalian 
retroviruses. In addition, few cell lines were able to 
produce retroviruses completely resistant to human
complement. Based on these two criteria, human fibrosarcoma 
HT1080 and rhabdomyosarcoma TE671 cells were selected for 
optimum construction of packaging cells. 
Other studies have shown the importance of endogenous
retrovirus expression in the generation of recombinant
retroviruses from retroviral packaging lines (Ronfort et
al., Virology (1995) 207, 271-275; Vanin, E.F. et al., J
Virol (1994) 68:4241-4250). The co-packaging of an
endogenous genome and a vector can lead to emergence of
recombinant retroviruses (Vanin et al., supra).
Recombination involves template switching during reverse
transcription of such hybrid retroviruses (Hu et al.,
Science (1990) 250:1227) and homologies between the two
genomes considerably enhance the frequency of reverse
transcriptase jumps (Zhang et al., J. Virol. (1994) 68:
2409-2414). Therefore an ideal packaging cell line should
not express endogenous MLV-like (or type C retrovirus-like)
retroviral genomes which can be packaged by type C gag
proteins (Scadden et al., J. Virol. (1990) 64: 424-427;
Torrent et al., J. Mol. Biol. (1994) 240 434-444).

Packaging of human endogenous retroviral RNA was not
detected in TELCeB and FLY packaging cells when virion-
associated RNA was analysed by RT-PCR using generic primers.

HT1080- and TE671-derived packaging cell lines may be safer
in this respect than those generated from NIH3T3 cells, such
as GP+EAM12 cells, which are known to express and package
sequences related to type C retroviruses (Scadden et al.,
supra).

To generate the FLY packaging cell lines, HT1080 cells were
transfected with gag-pol and env expression plasmids
designed to optimise viral protein expression. Direct
selection for viral gene expression was achieved in
accordance with the invention by expression of a selectable
marker gene by re-initiation of translation of the mRNA
expressing the viral proteins. This strategy resulted in
packaging cell lines capable of producing extremely high
titer viruses. Furthermore, long-term expression of
packaging functions can be maintained in these cells. Many
unnecessary viral sequences were eliminated from the
packaging constructs to reduce the risk of helper virus
generation; indeed the final packaging cells did not produce
helper virus, in that no replication competent virus (RCR)
could be detected per 10^ vector particles.

The FLY packaging cells described herein are safer than, for
example, psiCRIP cells, at least for generation of env
recombinant retroviruses as is illustrated in Table 4
hereinafter, probably because fewer retroviral sequences
overlapping with the vector were present in the present env-
expression plasmid. Few reports have addressed the question
of the characterization of recombinant retroviruses (RVs)
(Cosset, F.L., et al., Virology (1993) 193:385-395). It is
possible that such RVs could not be detected in previous
packaging cell lines due to lower overall titers. RVs are
defective in normal cell culture conditions but are likely
to evolve to replication competent viruses if they are
allowed to replicate in cells complementing their expression
like co-cultivated packaging cells (Bestwick et al., Proc.
Natl Acad Sci USA (1988) 85: 5404-5408; Cosset et al.
(1993) supra).

In preferred retroviral packaging systems according to the 
invention, RVs are eradicated for example by removal of 
viral LTRs from the packaging construct. 

Consistent with our previous studies (Takeuchi, Y., et al.,
J Virol (1994) 68:8001-8007), LacZ(RD114) and LacZ(MLV-A)
pseudotypes produced from HT1080 and TE671 cells were more
resistant to human complement than LacZ(RD114) or LacZ(MLV-
A) pseudotypes produced by 3T3 or dog cells. It was
therefore decided to use RD114 and MLV-A env genes to
generate recombinant virions with MoMLV cores.

The sequence of the RD114 env gene was determined and is shown
in Figure 4. It was found to be very close to BaEV (baboon
endogenous virus), a type C retrovirus (Benveniste, R.E. et
al., Proc. Natl. Acad. Sci. USA (1973) 70:3316-3320; Kato,
S. et al., Japan. J. Genet. (1987) 62:127-137) with an
envelope gene displaying similarities to the external part
of type D simian retroviruses (SRVs). RD114 uses the SRV
receptor on human cells (Sommerfelt & Weiss, Virology
(1990) 176:58-69; Sommerfelt, M.A. et al., J Virol (1990)
64:6214-6220), making the FLY packaging cells with RD114
envelope capable of generating virions with different
tropism. Retroviral vectors prepared so far for human gene
therapy have used either MLV-A or GALV (gibbon ape leukemia
virus) envelopes which display some similarities (Battini,
J.L., et al., J Virol. (1992) 66:1468-1475) and which use two
related cell surface receptors for infection (Miller, D.G.,
et al., J Virol (1994) 68:8270-8276). Differences in tissue-
specific expression of MLV-A or GALV receptors have been
reported (Kavanaugh et al., Proc Natl Acad Sci USA (1994)
91:7071-7075).

The invention will now be particularly described by way of 
example with reference to the accompanying drawings in 
which: 

Figure 1 illustrates the structure and expression of CeB.
The env gene (XbaI-ClaI) of plasmid pCRIP was removed and
was replaced by coinsertion of the two fragments XbaI-SfiI
(restriction sites underlined) from pOXEnv and a SfiI-ClaI
PCR product containing the bsr selectable marker. This
results in positioning the bsr start codon (shadowed) 74 bp
downstream to the pol stop codon (bold).
Open triangles are start codons (gag and bsr), black
triangles are stop codons (pol and bsr). The shadowed
triangle is the start codon of env, in the same reading
frame with that of bsr. SD and SA are the splice donor and
splice acceptor sites.

Figure 2 illustrates the structure and expression of 
FbdelPASAF. 

Immediately after the stop codon of env (bold) was inserted
a non retroviral KasI-NcoI (restriction sites underlined)
linker which positions the phleo start codon (shadowed) 76
bp downstream.

Open triangles are start codons (env and phleo), black
triangles are stop codons (env and phleo). SD and SA are
the splice donor and splice acceptor sites.

Figure 3 illustrates plasmids for expression of Ampho, Eco,
RD114, Xeno, 10A1, GALV, VSV-G and FeLVB envelopes.
All genes are expressed in the same backbone as detailed in
fig. 2. The BglII sites for ecotropic (MoMLV strain),
10A1, xenotropic (NZB.1.V6 strain) and amphotropic (4070A
strain), the NdeI site of RD114 (SC3C strain) and the BamHI site
for both FeLVB and GALV were used as 5' ends, and linked to
the MscI site immediately after the splice donor site in the
leader of FB29 LTR.

Figure 4 shows the sequence of the RD114 env gene (SEQ ID No 

Figure 5 shows the genetic structure of gag-pol constructs. 
Initiation W and termination (T) codons are shown. The 
thick dotted line below each construct shows MLV-derived 
sequences. Nucleotide positions of MLV-derived sequences are 
shown according to Shinnick et al. (1981) (from nt 1 to nt
6000 with deletion of the packaging signal (DY) from BalI
(nt 215) to PstI (nt 568), and with some further MoMLV
sequences in both CeB and CeB DS from nt 7676 to nt 7938).
gag-pol and bsr genes were expressed from the same
transcription unit using either a retroviral promoter
(Mo LTR) or a non retroviral promoter (hCMV) and a non
retroviral polyadenylation sequence (polyA). Splice donor
(SD) and acceptor (SA) sites are indicated. The thin line
denotes retroviral non coding sequences. The thick line
shows the rabbit beta-1 globin intron B. The position of
some restriction sites is indicated.

The nucleic acid sequences of portions of constructs (as
shown in Figure 5 (boxed areas)) are displayed for CeB (SEQ
ID No 2, Figure 6), hCMV+intron (SEQ ID No 3, Figure 7) and
hCMV+intronkaSD (SEQ ID No 4, Figure 8).



The nucleic acid sequences of portions of constructs (as
shown in Figure 3 (boxed areas)) are displayed for
FbdelPASAF (SEQ ID No 5, Figure 9), FbdelPMOSAF (SEQ ID No
6, Figure 10), FbdelPGASAF (SEQ ID No 7, Figure 11),
FbdelPRDSAF (SEQ ID No 8, Figure 12) and CMV10A1 (SEQ ID No
9, Figure 13).

The components of the viral particles are produced by two
independent expression plasmids (gag-pol or env) which also
contain selectable markers (bsr or phleo) expressed from the
same transcriptional units as gag-pol or env (figs. 1 & 2).
The selectable markers are located downstream to gag-pol or
env genes and there is an optimal distance between the stop
codon of the upstream reading frames and the start codon of
the selectable genes that should allow re-initiation of
translation (Kozak, Mol Cell Biol. (1987) 7: 3438-3445).
Because there is no "Kozak" sequence (Kozak, Cell (1986) 44:
283-292) required for a normal initiation of translation for
the marker genes, they can only be expressed by re-initiation
of translation after the upstream viral gene has been
successfully expressed. Consequently, and also because re-
initiation of translation is a poorly efficient process,
after transfection of these plasmids, cells resistant to the
drugs corresponding to those selectable genes express high
levels of the viral proteins.

To avoid viral transmission of these "helper" genomes the
constructs used suitably have the classical deletions of
both the packaging sequence located in the leader region and
of the 3'LTR, the latter being replaced by SV40
polyadenylation sequences (Figs 1 & 2).

Plasmid CeB is the MoMLV gag-pol expression unit. It
derives from pCRIP, a plasmid used to generate the
constructs introduced in the CRIP and CRE packaging cell
lines (Danos and Mulligan, 1988). As shown in fig. 1, for
generation of plasmid CeB most of the env gene of pCRIP has been
deleted and the bsr selectable marker, encoding a
protein conferring resistance to blasticidin (Izumi et al.,
Experimental Cell Research (1991) 197, 229-233), has been
inserted downstream to the pol gene. There are exactly 74 bp
with no ATG triplets between the stop codon of pol and the
start codon of bsr; this allows its expression by re-
initiation of translation on the gag-pol mRNA, after
translation of the gag-pol reading frame.
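
The two properties stated for the spacer (its exact length and the absence of ATG triplets) can be checked on any candidate sequence with a short script. The sketch below is illustrative only: the helper names are hypothetical, and the toy sequence is invented for the example and is not the CeB sequence.

```python
def spacer_between(mrna: str, stop_codon_start: int, marker_atg_start: int) -> str:
    """Return the nucleotides lying strictly between the 3-nt stop codon
    of the upstream gene and the start codon of the marker gene.

    Positions are 0-based indices into the sequence string.
    """
    return mrna[stop_codon_start + 3:marker_atg_start]


def contains_atg(spacer: str) -> bool:
    """True if an ATG (or AUG) triplet occurs in any reading frame."""
    return "ATG" in spacer.upper().replace("U", "T")


# Toy example (hypothetical sequence): a short upstream ORF ending in TAA,
# a 74-nt spacer free of ATG, then a short marker ORF starting with ATG.
toy_spacer = "C" * 74
toy_mrna = "ATGAAATAA" + toy_spacer + "ATGAAATAG"

spacer = spacer_between(toy_mrna, stop_codon_start=6, marker_atg_start=83)
print(len(spacer), contains_atg(spacer))
```

A substring search is used deliberately, so that a start codon in any of the three reading frames of the spacer is detected.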

FbdelPASAF is a plasmid expressing the amphotropic env gene
and the phleo selectable marker conferring resistance to
phleomycin (Gatignol et al., FEBS Letters (1988) 230:171-
175). By using a PCR-mediated mutagenesis strategy which
modifies the end of the env gene (see fig. 2), a 76 bp linker
was inserted between the stop codon of env and the start
codon of phleo. This allows expression of phleo from the
env mRNA by re-initiation of translation. In addition,
compared to known env-expressing constructs, this strategy
of construction has reduced the length of sequences
overlapping with the ends of conventional retroviral
vectors. The env genes of Mo-MLV, FeLVB, NZB.1V6, 10A1,
GALV and RD114 are expressed by plasmids FBdelPMoSAF,
FBdelPBSAF, FBdelXSAF, FBdelpGSAF, FBdelp10A1SALF and
FBdelPRDSAF, respectively, by using the same backbone as
FBdelPASAF (fig. 3). Retroviral vectors produced with the
RD114 envelope will be useful for in vivo gene delivery as,
in comparison with MLV ecotropic or amphotropic envelopes,
virions pseudotyped with RD114 envelopes are not inactivated
by human complement when they are produced by Mink Mv-1-Lu
cells or by some human cells (Table 1).

The HT1080 cell line, isolated from a human fibrosarcoma
(ATCC CCL121), and the TE671 cell line, isolated from a human
rhabdomyosarcoma (ATCC CRL 8805) (purchased from ATCC, and
tested for absence of usual cell culture contaminants by
ECACC), have been used for the definitive construction of
packaging cell lines. The HT1080 line was chosen among a panel
of primate and human lines because MLV-A and RD114
efficiently rescued retroviral vectors from these cells and
also because RD114 pseudotypes produced by this cell line
were stable when incubated in human serum. In a standard
assay (Takeuchi et al., J Virol (1994), 68, 8001-8007),
these latter viruses were found to be more than 500 fold more
stable than similar pseudotypes produced in 3T3 cells.

Another advantage of the use of non murine cells to derive
packaging lines is the absence of MLV-related endogenous
retroviral-like sequences (like VL30 in 3T3 cells) that can
cross-package with MLV-derived retroviral vectors (Torrent
et al., 1994) and generate potentially harmful recombinant
retroviruses.
The helper constructs were introduced into other cell lines
(HT1080 (table 2), Mink Mv-1-Lu (table 2), 3T3 (not shown),
TE671 (table 2)) for the purpose of comparisons of the
efficiency of the constructs.

As illustrated hereinafter (Table 2), the reverse
transcriptase (RT) activity (provided by expression of the
pol gene) in cells transfected with CeB is significantly
higher than that of the same cells transfected by the
parental plasmid pCRIP or that of cells chronically infected
by MLV. This enhancement of viral gene expression is
correlated with the titers of lacZ retroviral vectors when
an envelope is provided in CeB-lacZ cells after comparison
with titers of lacZ pseudotypes of either replication-
competent viruses or other helper-free packaging systems.

For the generation of final packaging cell lines, the best
clonal env transfectants have been selected. Packaging
systems obtained in this way will be able to produce helper-
free retroviral vectors at titers greater than 10*
infectious particles per ml, which would be 10-100 fold
higher than helper-free preparations of others.

Because of the way the selectable markers are expressed (see
above), growing the packaging cells under phleomycin and
blasticidin selective pressure increases and stabilizes the
expression of the retroviral components and particularly the
envelopes, as it is possible that env glycoproteins have
toxic effects for the producer cells in the long term which
may lead to a decrease of expression.

Such an enhancement of viral production observed with the
packaging systems described herein might increase the
emergence of unwanted retroviruses having recombined between
the genomes of both the retroviral vector and either of the
two packaging-deficient constructs. However, the constructs
have been designed in such a way as to reduce the
probability of emergence of recombinant viruses compared to
the parental constructs. To check their safety, attempts
have been made to detect the presence of replication-
competent retroviruses by a mobilisation assay of a lacZ
provirus. No RC viruses have been found in any of the retroviral
vector preparations tested so far.

The following Examples illustrate the invention. 
Example 1 

Preparation of Cell lines and viruses. 
The following cell lines were used: 

A204 (ATCC HTB 82), HeLa (ATCC CCL2), HT1080 (ATCC CCL121),
MRC5 (ATCC CCL171), T24 (ATCC HTB 4), VERO (ATCC CCL81) and
D17 (ATCC CCL183) were purchased from ATCC.

HOS, TE671 and Mv-1-Lu cells and their clones harboring the
MFGnlslacZ retroviral vector were as described by Takeuchi et
al., J Virol (1994), 68, 8001-8007.

The above cell lines were grown in DMEM (Gibco-BRL, U.K.) 
supplemented with 10% fetal calf serum.

EB8 (Battini et al., J. Virol (1992) 66: 1468-1475);
psiCRE, psiCRELLZ and psiCRIP (Danos et al., Proc. Natl.
Acad. Sci USA (1988) 85: 6460-6464);

GP+EAM12 cells (Markowitz et al., Virology (1988), 167, 400-
406); and

NIH-3T3 murine fibroblasts.

These cell lines were grown in DMEM (GIBCO-BRL, U.K.)
supplemented with 10% new-born calf serum. 
Mv-1-Lu, TE671 and HT1080 cells were transfected using the
calcium-phosphate precipitation method (Sambrook et al.,
"Molecular Cloning" 1989, Cold Spring Harbor Laboratory
Press: New York) as described elsewhere (Battini et al.,
supra). CeB-transfected Mv-1-Lu, TE671 and HT1080 cells
were selected with 3, 6-8 and 4 µg/ml of blasticidin S (ICN,
UK), respectively, and blasticidin-resistant colonies were
isolated 2-3 weeks later. Cells transfected with the various
env-expression plasmids were selected with phleomycin
(CAYLA, France): 50 µg/ml (for FBASALF-transfected cells) or
10 µg/ml (for FBASAF-, FbdelPASAF-, FbdelPMOSAF-,
FBdelP10A1SAF- or FBdelPRDSAF-transfected cells). Phleomycin-
resistant colonies were isolated 2-3 weeks later.

Production of lacZ pseudotypes using replication competent
viruses, amphotropic murine leukemia virus (MLV-A) 1504
strain and cat endogenous virus RD114, was carried out as
described previously (Takeuchi et al., J Virol (1994), 68,
8001-8007).

Example 2 

Preparation of Plasmids. 

The env gene of pCRIP (Danos et al., supra) was excised by
HpaI/ClaI digestion. A 500 bp PCR-generated DNA fragment was
obtained using pSV2-bsr (Izumi et al., Experimental Cell
Research (1991), 197, 229-233) as template and a pair of
oligonucleotides:

(5'>CGGAATTCGGATCCGAGCTCGGCCCAGCCGGCCACCATGAAAACATTTAACATTTC
TC) (SEQ ID NO 2) at 5' end and

(5'>GATCCATCGATAAGCTTGGTGGTAAAACTTTT) (SEQ ID NO 3) at 3'
end, with SfiI and ClaI sites, respectively. This fragment
was inserted in HpaI/ClaI sites of pCRIP by co-ligation with
an 85 bp HpaI/SfiI DNA fragment isolated from pOXEnv (Russell
et al., Nucleic Acids Research (1993), 21, 1081-1085) which
provides the end of the Moloney murine leukemia virus
(MoMLV) pol gene. The resulting plasmid, named CeB (Fig. 1),
could express the MoMLV gag-pol gene as well as the bsr
selectable marker conferring resistance to blasticidin S,
both driven by the MoMLV 5' LTR promoter.

A series of env-expression plasmids was generated using the
4070A MLV (amphotropic) env gene (Ott et al., J Virol
(1990), 64, 757-766) and the FB29 Friend MLV promoter
(Perryman et al., Nucleic Acids Res (1991), 19, 6950). In
FBASALF (Fig. 1) a BglII/ClaI fragment containing the env
gene was cloned in BamHI/ClaI sites of plasmid FB3LPh, which
also contained the C57 Friend MLV LTR driving the expression
of the phleo selection marker. A 136 bp env fragment was
generated by PCR using plasmid FB3 (Heard et al., J Virol
(1991), 65, 4026-4032) as template and a pair of
oligonucleotides: (5'>GCTCTTCGGACCCTGCATTC) (SEQ ID NO 4) at
5' end (before ClaI site) and

(5'>TAGCATGGCGCCCTATGGCTCGTACTCTATAGGC) (SEQ ID NO 5) at 3'
end, providing a KasI restriction site immediately after the
env stop codon. This PCR fragment was digested using ClaI
and KasI. A DNA fragment containing the FB29 LTR and the
MLV-A env gene was obtained by NdeI/ClaI digestion of
FBASALF. The fragments were co-ligated in NdeI/KasI-digested
pUT626 (kindly provided by Daniel Drocourt, CAYLA labs,
France). In the resulting plasmid, named FBASAF (Fig. 1),
the phleo selectable marker was expressed from the same mRNA
as the env gene. A BglII restriction site was created after
the MscI site at position 214 in the FB29 leader by using a
commercial linker (Biolabs, France). An NdeI/BglII fragment
containing the FB29 LTR was co-inserted with the BglII/ClaI
env fragment in NdeI/ClaI-digested FBASAF plasmid DNA,
resulting in plasmid FBdelPASAF (Fig. 1). Compared to
FBASAF, FBdelPASAF has a 100 bp larger deletion in the leader
region.

Example 3

Cloning and Sequencing of the RD114 env gene
The RD114 env gene was first sub-cloned in plasmid
Bluescript KS+ (Stratagene) as a 3 kb HindIII insert
isolated from SC3C, an RD114 infectious DNA clone (Reeves et
al., J. Virol (1984), 52, 164-171). A 2.7 kb ScaI-HindIII
fragment of this subclone containing the RD114 env gene was
sequenced (Figure 4 (SEQ ID NO 1); EMBL accession number
X87829). The 5' non-coding sequence upstream of an NdeI site
was deleted by an EcoRI/NdeI digestion followed by filling-in
with Klenow enzyme and self-ligation. From this plasmid,
two DNA fragments were obtained: a BamHI/NcoI 2.5 kb
fragment and a 63 bp PCR-generated DNA fragment using
(5'>CGCCTCATGGCCTTCATTAA) (SEQ ID NO 6) at 5' end (before
NotI site) and (5'>TAGCATGGCGCCTCAATCCTGAGCTTCTTCC) (SEQ ID
NO 7) at 3' end, providing a KasI restriction site just
after the RD114 env gene stop codon. The PCR fragment was
digested with NcoI and KasI. Both fragments were
co-inserted between BglII and KasI sites of FBdelPASAF and the
resulting plasmid was named FBdelPRDSAF (Fig. 1).
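
The placement of the KasI site relative to the env stop codon in the two 3' primers (SEQ ID NO 5 and SEQ ID NO 7) can be checked with a short script. This is only an illustrative sketch: the helper names are hypothetical, and it assumes the KasI recognition sequence GGCGCC and that both primers are written 5' to 3' on the antisense strand, as quoted above.

```python
# Sketch: confirm that the KasI site (GGCGCC, assumed recognition sequence)
# sits immediately after a stop codon on the sense strand for the two
# antisense 3' primers quoted in the text. Helper names are hypothetical.

KASI_SITE = "GGCGCC"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def codon_before_site_on_sense(antisense_primer: str, site: str = KASI_SITE) -> str:
    """Return the sense-strand codon lying directly 5' of `site`.

    On the antisense primer these are the three bases directly 3' of the
    (palindromic) site, so they are reverse-complemented before returning.
    """
    primer = antisense_primer.upper()
    pos = primer.find(site)
    if pos < 0:
        raise ValueError("restriction site not found in primer")
    following = primer[pos + len(site): pos + len(site) + 3]
    return reverse_complement(following)


SEQ_ID_NO_5 = "TAGCATGGCGCCCTATGGCTCGTACTCTATAGGC"  # MLV-A env 3' primer
SEQ_ID_NO_7 = "TAGCATGGCGCCTCAATCCTGAGCTTCTTCC"     # RD114 env 3' primer

for name, primer in (("SEQ ID NO 5", SEQ_ID_NO_5), ("SEQ ID NO 7", SEQ_ID_NO_7)):
    print(name, codon_before_site_on_sense(primer))
```

Under these assumptions the codon directly upstream of the KasI site reads TAG for SEQ ID NO 5 and TGA for SEQ ID NO 7, both stop codons, which agrees with the description of the site as lying immediately after the env stop codon.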

Plasmid pCRIPAMgag- (Danos, O. et al., Proc Natl Acad
Sci USA (1988) 85: 6460-6464) was used for transfection.

Example 4 
Infection assays. 

Target cells were seeded in 24-multiwell plates (4x10^4 cells
per well) and were incubated overnight. Infections were then
carried out at 37°C by plating 1 ml dilutions of viral
supernatants in the presence of 4 µg/ml polybrene (Sigma) on
target cells. 3 h later virus-containing medium was replaced
by fresh medium and infected cells were incubated for two
days before X-gal staining, performed as previously
described (Tailor et al., J Virol (1993), 67, 6737-6741;
Takeuchi et al., J Virol (1994), 68, 8001-8007). Viral
titers were determined by counting lacZ-positive colonies as
previously described (Cosset et al., J. Virol. (1990) 64:
1070-1078). Stability of lacZ pseudotypes in fresh human
serum was examined by titrating surviving virus after
incubation in a 1:1 mixture of virus harvest in serum-free
medium and fresh human serum for 1 h at 37°C as described
before (Takeuchi et al., supra).
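
The titration arithmetic implied above can be sketched as follows. The cited references give the actual protocol; this sketch simply assumes the usual convention that the titer in lacZ i.u./ml equals the number of lacZ-positive colonies multiplied by the dilution factor and divided by the inoculum volume. The function name and the example counts are hypothetical.

```python
# Sketch of a colony-count titration, assuming
#   titer = lacZ-positive colonies x dilution factor / inoculum volume (ml).
# Function name and example numbers are illustrative, not from the source.

def lacz_titer_iu_per_ml(positive_colonies: int,
                         dilution_factor: float,
                         inoculum_ml: float = 1.0) -> float:
    """Return the titer in lacZ infectious units per ml of undiluted supernatant."""
    if dilution_factor <= 0 or inoculum_ml <= 0:
        raise ValueError("dilution factor and inoculum volume must be positive")
    return positive_colonies * dilution_factor / inoculum_ml


# Hypothetical well: 25 lacZ-positive colonies after plating 1 ml of a
# 1:1000 dilution of the viral supernatant.
print(lacz_titer_iu_per_ml(25, 1000, 1.0))
```

With 1 ml inocula, as used in the infections described here, the titer reduces to the colony count multiplied by the dilution factor.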

Example 5 

Reverse transcriptase (RT) assay. 

RT assays were performed either as described previously
(Takeuchi et al., supra) or using an RT assay kit
(Boehringer-Mannheim, U.K.) following the manufacturer's
instructions but using MnCl2 (2 mM) instead of MgCl2.

Example 6 

Screening producer cell lines. 

Viral particles generated with RD114 envelopes have been
found to be more stable in human serum than virions with
MLV-A envelopes, and the producer cell line has also been
found to control sensitivity (Takeuchi et al., supra). A
panel of cell lines was screened for their ability to produce
high titer viruses and for the sensitivity of these virions to
human serum. To do this, cells were infected at high
multiplicity with lacZ pseudotypes of either MLV-A or RD114
and cells producing helper-positive lacZ pseudotypes were
established. Human HT1080 and TE671 and mink Mv-1-Lu cells
were found to release high titer lacZ(RD114) and lacZ(MLV-A)
viruses. LacZ(MLV-A) pseudotypes produced by HT1080 cells
were more resistant to human serum than those produced by
other cells. The titer of these viruses was only four-fold
less following a 1 hr incubation with human serum than a
control incubation (Table 1). LacZ(RD114) pseudotypes
produced by human cells or mink Mv-1-Lu cells were in
general stable in human serum (Table 1). These results
suggested that HT1080, TE671 and Mv-1-Lu cells provided the
best combination of high lacZ titers and resistance to human
serum and they were therefore used for the generation of
retroviral packaging cells.

Table 1. Titer and stability of lacZ pseudotypes.

Producer    LacZ(MLV-A)                  LacZ(RD114)
cell        Titer (a)    Stability (b)   Titer (a)    Stability (b)

A204              650    <3                  1,200    105
HeLa                9    nd                  2,000    115
HOS             4,500    6                  23,000    86
HT1080      2,000,000    26                400,000    129
MRC-5             450    10                  1,000    nd
T24               350    nd                  1,200    nd
TE671          15,000    2                  90,000    38
VERO              260    nd                     90    nd
D17               900    <1                200,000    1
Mv-1-Lu        80,000    1                 200,000    120

a: titration on TE671 cells as lacZ i.u./ml
b: % of infectivity of human serum-treated viruses compared to fetal calf
serum-treated viruses
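
The stability figure defined in footnote b, and its relation to the "four-fold" reduction quoted in the text for HT1080-derived lacZ(MLV-A), can be illustrated with a short calculation. This is a sketch only: the function names are hypothetical and the serum-treated titer used in the example is an invented number chosen to reproduce the 26% stability value, not a measured one.

```python
# Sketch: stability as % of control titer (footnote b) and the equivalent
# fold reduction. Function names and the 520,000 example titer are
# hypothetical; only the 26% value is taken from Table 1.

def serum_stability_percent(titer_serum_treated: float, titer_control: float) -> float:
    """Infectivity after human serum treatment as a percentage of the control."""
    if titer_control <= 0:
        raise ValueError("control titer must be positive")
    return 100.0 * titer_serum_treated / titer_control


def fold_reduction(stability_percent: float) -> float:
    """Fold loss of titer corresponding to a given stability percentage."""
    if stability_percent <= 0:
        raise ValueError("stability must be positive")
    return 100.0 / stability_percent


# HT1080 / lacZ(MLV-A): a control titer of 2,000,000 and a stability of 26%
# would correspond to a serum-treated titer of about 520,000 lacZ i.u./ml.
print(serum_stability_percent(520_000, 2_000_000))
print(round(fold_reduction(26.0), 1))
```

A stability of 26% corresponds to a reduction of a little under four-fold, in line with the statement made for HT1080 cells.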



Example 7 



Construction of an improved gag-pol expression vector.
A MoMLV gag-pol expression plasmid, CeB (Fig. 1), was
derived from pCRIP (Danos et al., Proc. Natl. Acad Sci USA
(1988) 85: 6460-6464). Approximately 2 kb of env sequence
were removed from pCRIP and the bsr selectable marker,
conferring resistance to blasticidin S (Izumi et al.,
Experimental Cell Research (1991) 197: 229-233), was inserted
74 nts downstream of the gag-pol gene. This 74 nts interval
had no ATG triplets and was thought to provide an optimal
distance between the stop codon of the pol reading frame and
the start codon of the bsr gene to allow re-initiation of
translation (Kozak, Mol Cell Biol. (1987) 7: 3438-3445).
There was no "Kozak" consensus sequence (Kozak, Cell (1986)
44: 283-292) at the 5' end of the marker gene. Therefore,
bsr could only be expressed by re-initiation of translation
after the upstream gag-pol gene had been expressed.
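
The two spacer requirements stated here (a defined length and no ATG triplets in any reading frame) are easy to verify for a candidate sequence. The sketch below is illustrative only: the function names are hypothetical, and the 74 nt spacer is a made-up toy sequence, since the actual intercistronic sequence of CeB is not reproduced in this text.

```python
# Sketch: check an intercistronic spacer for length and for ATG triplets in
# any frame. The toy spacer below is hypothetical; it is NOT the CeB sequence.

def atg_positions(spacer: str) -> list:
    """Return 0-based positions of every ATG triplet (all three frames)."""
    s = spacer.upper().replace("U", "T")  # accept RNA or DNA notation
    return [i for i in range(len(s) - 2) if s[i:i + 3] == "ATG"]


def describe_spacer(spacer: str) -> dict:
    """Summarise the properties relevant to translation re-initiation."""
    hits = atg_positions(spacer)
    return {"length_nt": len(spacer), "atg_positions": hits, "atg_free": not hits}


toy_spacer = "GCTAGC" * 12 + "GC"      # hypothetical 74 nt spacer without ATG
print(describe_spacer(toy_spacer))
print(describe_spacer("CCATGG"))       # counter-example containing an ATG
```

A spacer that passes this check offers no internal start codon between the upstream stop codon and the marker start codon, so the marker can only be translated by ribosomes that re-initiate after finishing the upstream reading frame.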

Consequently, after transfection of CeB in
Mv-1-Lu/MFGnlslacZ (ML), TE671/MFGnlslacZ (TEL) or HT1080 cells,
blasticidin S-resistant bulk populations and most cell
clones expressed high levels of gag-pol proteins, as assessed
by the reverse-transcriptase (RT) activity found in cell
supernatants (Table 2). Considerably higher RT activities
were found in bulk populations of CeB-transfected ML cells
compared to bulk populations of ML cells stably transfected
with the parental pCRIP construct. Similarly the RT
activities of two packaging cell lines generated using the
pCRIPenv- construct, psiCRE cells (Danos et al., supra) and
EB8 cells (Battini et al., supra), were less than that of
CeB-transfected clones (Table 2). Finally, RT activity in
CeB-transfected cell supernatants was higher than that of cells
chronically infected by replication-competent MLV-A (Table
2).

Table 2. Secreted reverse transcriptase expression

Cells (a)         RT activity (b)   LacZ titer (c)

ML/MLV-A          1                 8x10^
MLSvB             0.1               <1
MLCRIP            0.15              nd
MLCeB (bulk)      1.7               nd
MLCeB (clone)     4.2               1x10^
MLCeB (clone)     1.?               1x10^
TEL/MLV-A         3.6               2x10^
TELCeB6           5.2               4x10'
HT1080/MLV-A      1.1               1x10^
HTCeB6            1.9               1x10^
HTCeB18           2.7               2x10^
HTCeB22 (FLY)     6.9               5x10^
HTCeB48           5.5               3x10^
EB8               0.22              1x10\
psiCRE-LLZ        1.2               1x10^ (d)

a: ML, Mv-1-Lu cells harboring a MFGnlslacZ provirus; TEL, TE671 cells harboring
a MFGnlslacZ provirus; /MLV-A, cells chronically infected with MLV-A 1504
strain; MLSvB, ML cells transfected with plasmid pSV2bsr alone; MLCRIP, ML
cells co-transfected with pCRIP and pSV2bsr.

b: Average of arbitrary units relative to ML/MLV-A RT activity of at least two
independent experiments is shown. The standard errors did not exceed 20 % of
the values.

c: titration on TE671 cells as lacZ i.u./ml, after polyclonal transfection of a
plasmid which expresses MLV-A env in MLCeB clones, TELCeB clones, HTCeB clones
and EB8 cells; nd, not done.
d: titration on NIH3T3 cells



To rescue infectious lacZ viruses, MLCeB and TELCeB clones
were transfected with FBASALF DNA, a plasmid designed to
express the MLV-A env gene (Fig. 1). Bulk populations of
stable FBASALF transfectants were isolated and supernatants
were titrated using TE671 cells as targets. Titers of lacZ
viruses were higher than those of either MLV-A infected ML
or TEL cells, or FBASALF-transfected EB8 cells (Table 2).
These data suggested that CeB was an extremely efficient MLV
gag-pol expression vector in mink Mv-1-Lu and TE671 cells. CeB
was therefore used to derive packaging cells by transfection
of HT1080 cells. 41/49 blasticidin S-resistant colonies had
detectable levels of RT; 9 had RT activity higher than that
of control MLV-A-infected HT1080 cells (data not shown).
Expression of gag precursor was confirmed in cell lysates
and supernatants of these 9 HTCeB clones by immunoblotting
using antibodies against p30-CA (data not shown). The 4
clones with the highest expression of gag proteins (clones
6, 18, 22 and 48) were infected at high multiplicity with
helper-free lacZ pseudotypes bearing MLV-A envelopes
(MFGnlslacZ(A)) produced by TELCeB6/FBASALF (Table 3) and
then transfected with FBASALF. Supernatants of bulk,
phleomycin-resistant transfectants were assessed for RT
activity and lacZ titer (Table 2). Clone HTCeB22, named FLY,
was found to be the best gag-pol producer clone and was used
to introduce env expression vectors for the generation of
packaging cell lines.

Table 3. Titer following env construct transfection

Producer cell        Env source          Titer (a)

psiCRIP lacZ 5       pCRIPAMgag-         Txio^
GP+EAM12 lacZ 25     envAM               3x10^^
TELCeB6              FBASALF (c)         5x10^
                     FBASAF (c)          2x10^
                     FBdelPASAF (c)      2x10'
TELCeB6              FBdelPASAF 1        3x10''
                     FBdelPASAF 4        2x10*^
                     FBdelPASAF 6        1x10''
                     FBdelPASAF 7        5x10'
                     FBdelPASAF 8        1x10'
                     FBdelPRDSAF 2       1x10^
                     FBdelPRDSAF 4       3x10^
                     FBdelPRDSAF 7       1x10'
                     FBdelPRDSAF 8       2x10*
FLY (d)              FBdelPASAF 1        1x10^
                     FBdelPASAF 4        1.5x10*
                     FBdelPASAF 5        1x10*
                     FBdelPASAF 1        1x10*
                     FBdelPASAF 13       7x10*
                     FBdelPASAF 14       4x10*
                     FBdelPASAF 15       1x10*
                     FBdelPASAF 16       5x10*
                     FBdelPASAF 17       6x10*
FLYA4 lacZ 3         FBdelPASAF 4        2x10' (b)
FLY (d)              FBdelPRDSAF 1       2.5x10*
                     FBdelPRDSAF 2       1x10'
                     FBdelPRDSAF 6       5x10*
                     FBdelPRDSAF 10      2x10*
                     FBdelPRDSAF 11      3x10*
                     FBdelPRDSAF 13      1x10*
                     FBdelPRDSAF 17      5x10*
                     FBdelPRDSAF 18      3x10'
                     FBdelPRDSAF 19      6x10*

Average titers of at least three independent experiments are shown. The
standard errors did not exceed 30 % of the titer values.

a: titrated on TE671 cells as lacZ i.u./ml
b: results of best MFGnlslacZ producer clones.
c: bulk populations of env-transfectants in TELCeB6 cells.
d: titration after bulk infection with helper-free MFGnlslacZ.

Example 8 

Construction of env expression vectors. 
A series of MLV-A env expression plasmids were then
generated (Fig. 1). In FBASALF, the env gene was inserted
between two Friend-MLV LTRs, its expression driven by the
FB29 MLV LTR (Perryman et al., supra). Most of the packaging
signal located in the leader region was deleted. This
plasmid also expressed the phleo selectable marker (Gatignol
et al., supra) driven by the 3' LTR. FBASAF and FBdelPASAF
were then designed following the same strategy used for CeB.
These two vectors differed only by the extent of deletion of
the packaging signal, FBdelPASAF having virtually no leader
sequence. Compared to the pCRIPAMgag- and pCRIPgag-2 env
plasmids expressed in psiCRIP or psiCRE packaging cells
(Danos et al., supra), about 5 kb of gag-pol sequences was
removed. In addition the 258 bp retroviral sequence
containing the end of the env gene and the beginning of U3
found in pCRIPAMgag- and pCRIPgag-2 was also removed. For both
FBASAF and FBdelPASAF plasmids, the phleo selectable marker
was inserted downstream of the env gene by positioning a 76
nts linker with no ATG codons between the two open reading
frames. Phleo could therefore only be expressed by
re-initiation of translation by the same ribosomal unit that
had expressed the upstream env open reading frame.
FBdelPASAF was also used to generate FBdelPRDSAF, an RD114
envelope expression plasmid (Fig. 1).

After transfection of the env plasmids into TELCeB6 cells
(Table 2), bulk populations of phleomycin-resistant colonies
were isolated and their production of lacZ virus measured
(Table 3). FBASALF gave a titer of 5x10' lacZ-i.u./ml whilst
titers with either FBASAF or FBdelPASAF were 2x10'
lacZ-i.u./ml (Table 3). Titers of 5x10' or 10' lacZ-i.u./ml could
be obtained with some FBdelPASAF cell clones or FBdelPRDSAF
clones, respectively.

As FBdelPASAF has minimal virus-derived sequences and was
shown to be the safest construct (see below and Table 4),
it and FBdelPRDSAF were used to generate packaging lines
from FLY cells (clone HTCeB22, Table 2). Envelope expression
of these clones was assayed by interference to challenge
with MFGnlslacZ(A) or MFGnlslacZ(RD) pseudotypes produced by
TELCeB6/FBdelPASAF-7 or TELCeB6/FBdelPRDSAF-7, respectively
(Table 3). The cell lines showing most interference were
cross-infected at high multiplicity with these pseudotypes
to provide MFGnlslacZ proviruses, and supernatants were then
titrated on TE671 cells (Table 3). FLY-FBdelPASAF-13 (FLYA13
packaging line) and FLY-FBdelPRDSAF-18 (FLYRD18 packaging
line) gave the highest productions of lacZ viruses, around
10' lacZ-i.u./ml. The best MFGnlslacZ producer clones derived
from either psiCRIP cells (Danos et al., supra) or GP+EAM12
cells (Markowitz et al., supra) gave approximately 50 fold
lower titers (Table 3). The lacZ titers of the FLY-derived
lines shown in Table 3 are lower than the best
TELCeB6-derived lines after transfection of either FBdelPASAF or
FBdelPRDSAF (Table 3). However it should be noted that the
lacZ provirus expressed in TELCeB6 cells was obtained after
clonal selection but was introduced polyclonally in
FLY-derived env-transfected cell clones. When FLY-FBdelPASAF-4
cells (FLYA4 packaging line), infected with helper-free
MFGnlslacZ(RD), were cloned by limiting dilution, the best
clones (e.g. FLYA4lacZ3) were found to produce 20 times more
infectious viruses than the bulk population, reaching the
range of titers obtained with the best TELCeB6-FBdelPASAF
clones (Table 3).

Example 9

Assays for transfer of gag-pol or env functions.

To assay for replication-competent viruses, supernatants
were used to infect TEL cells (a clone of TE671 cells
harboring an MFGnlslacZ provirus). Infected cells were
passaged for 6 days or longer and their supernatants were
used for infection of fresh TE671 cells. No transmission of
lacZ viruses could be detected (Table 4), demonstrating that
the supernatants of pCRIPAMgag-, FBASALF-, FBASAF- or
FBdelPASAF-transfected TELCeB6 cells were helper-free.
A similar absence of replication-competent recombinant
retroviruses was demonstrated using supernatant from a
clone of psiCRIP-MFGnlslacZ cells or from two clones of
FLYA-MFGnlslacZ cells (Table 4).

There have been reports that helper-free retroviral vector
stocks may nevertheless contain recombinant retroviruses
(replication incompetent) carrying either gag-pol or env
genes (Bestwick et al., Proc Natl Acad Sci USA (1988), 85,
5404-5408; Cosset et al., Virology (1993), 193, 385-395;
Girod et al., Virology (1995), in press). To assay for such
recombinant retroviruses, mobilisation of an MFGnlslacZ
provirus from two indicator cell lines which could
cross-complement potential recombinant viruses carrying either
gag-pol or env functional genes was attempted. The TELCeB6
line (Table 2) expressing gag-pol proteins was used as
indicator cell line to test for the presence of env
recombinant (ER) viruses. The TELMOSAF indicator line
expressing MoMLV env glycoproteins (obtained by transfection
of FBMOSAF, a plasmid expressing the MoMLV env gene using
the FBASAF backbone, in TEL cells) was used to detect the
presence of gag-pol recombinant retroviruses (GPR viruses).
After passaging 4-8 days, the supernatants of the infected
indicator cells were used to infect either human TE671 cells
or murine NIH3T3 cells.



TELCeB6 cells transfected with various env-expressing
constructs, pCRIPAMgag-, FBASAF and FBdelPASAF, were
compared. Although the supernatants of TELCeB6-FBdelPASAF
cells were devoid of replication-competent retroviruses,
they were found sporadically to transfer gag-pol genomes
(Table 4). No GPR viruses could be detected when less than
2x10^ virions were used to infect the indicator cells.
Similarly TELCeB6 indicator cells infected with various
helper-free viruses were shown sporadically to release lacZ
virions (Table 4). The number depended both on the
env-expression vector used and on the virus input quantity.
Compared to lacZ viruses generated using the pCRIPAMgag-
plasmid, the frequency of detection of the env-recombinant
viruses was lower for supernatants generated by using FBASAF
and FBdelPASAF constructs (Table 4). For the FBdelPASAF
construct, when less than 5x10^ MFGnlslacZ(A) helper-free
virions were used to infect the indicator cells, no ER
retroviruses could be detected. From these experiments, it
could be estimated that a supernatant, produced from
TELCeB6-FBdelPASAF cells, containing 1x10^ infectious units
of MFGnlslacZ retroviral vector contained no
replication-competent virus, and about 100 gag-pol and 100 env
recombinant retroviruses.

Table 4. Transfer of packaging function

Producer cell            Indicator cell   Input virus (a)   Detection (b)
                                          (lacZ-i.u.)       ++     +      -

Replication competent virus
psiCRIP lacZ 5           TEL              2x10*             0/4    0/4    4/4
TELCeB6-pCRIPAMgag-      TEL              5x10*             0/4    0/4    4/4
TELCeB6-FBASAF           TEL              5x10*             0/4    0/4    4/4
TELCeB6-FBdelPASAF       TEL              5x10*             0/4    0/4    4/4
FLYA4 lacZ 3             TEL              1x10'             0/4    0/4    4/4
FLYA4 lacZ 7             TEL              1x10'             0/4    0/4    4/4

Gag-pol recombinant
TELCeB6-FBdelPASAF 7     TELMOSAF         2x10'             0/4    1/4    3/4
TELCeB6-FBdelPASAF 7     TELMOSAF         2x10*             0/4    2/4    2/4
TELCeB6-FBdelPASAF 7     TELMOSAF         2x10*             0/4    O/A    z/4
TELCeB6-FBdelPASAF 7     TELMOSAF         2x10*             0/4    0/4    4/4

Env recombinant
TELCeB6-pCRIPAMgag-      TELCeB6          5x10*             2/4    1/4    1/4
TELCeB6-pCRIPAMgag-      TELCeB6          5x10*             1/4    1/4    2/4
TELCeB6-pCRIPAMgag-      TELCeB6          5x10*             0/4    2/4    2/4
TELCeB6-FBASAF           TELCeB6          5x10*             0/4    2/4    2/4
TELCeB6-FBASAF           TELCeB6          5x10*             0/4    1/4    3/4
TELCeB6-FBASAF           TELCeB6          5x10*             0/4    1/4    3/4
TELCeB6-FBdelPASAF       TELCeB6          5x10*             0/4    1/4    3/4
TELCeB6-FBdelPASAF       TELCeB6          5x10*             1/4    3/4    0/4
TELCeB6-FBdelPASAF       TELCeB6          5x10*             0/4    0/4    4/4

a: number of lacZ i.u. used to infect indicator cells
b: number of incidence out of four experiments. The ranges of lacZ titers
rescued from infected indicator cells are shown for each virus input: >100
lacZ i.u./ml (++), 1-100 lacZ i.u./ml (+) and <1 lacZ i.u./ml (-).
Titers were determined on TE671 cells for replication
competent virus and env recombinant and NIH3T3 cells for
gag-pol recombinant.
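
The scoring scheme of footnote b can be written out explicitly. The sketch below bins a rescued lacZ titer into the three detection classes and tallies incidences over a set of experiments. The function names and the example titers are hypothetical, and the treatment of the exact boundary values (1 and 100 i.u./ml counted as "+") is an assumption, since the footnote does not specify it.

```python
# Sketch of the Table 4 scoring: ++ (>100), + (1-100), - (<1 lacZ i.u./ml).
# Boundary handling and example titers are assumptions for illustration.
from collections import Counter


def detection_class(rescued_titer_iu_per_ml: float) -> str:
    """Bin a rescued lacZ titer into the '++', '+' or '-' class."""
    if rescued_titer_iu_per_ml < 0:
        raise ValueError("titer cannot be negative")
    if rescued_titer_iu_per_ml > 100:
        return "++"
    if rescued_titer_iu_per_ml >= 1:
        return "+"
    return "-"


def incidence(rescued_titers) -> dict:
    """Count how many experiments fall in each class, e.g. for an 'n/4' entry."""
    counts = Counter(detection_class(t) for t in rescued_titers)
    return {label: counts.get(label, 0) for label in ("++", "+", "-")}


# Hypothetical set of four experiments at one virus input.
example = [0.0, 0.2, 5.0, 0.0]
result = incidence(example)
print(" ".join(f"{result[k]}/{len(example)}" for k in ("++", "+", "-")))
```

For the hypothetical titers above the tally prints `0/4 1/4 3/4`, the same form as the entries in the Detection columns.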
Example 10

In order to confirm resistance to complement and absence of
replication competent virus in our best packaging lines,
MFGnlslacZ(A) and (RD) harvested from FLYA13 and FLYRD18,
respectively, after polyclonal transduction of MFGnlslacZ
(Table 3 above) were tested for stability in fresh human
serum and generation of replication competent virus. Titers
of MFGnlslacZ(RD) from FLYRD18 after 1 hr incubation with 3
independent samples of fresh human serum were 80 to 120 %
of control incubations, while titers of MFGnlslacZ(A) from
FLYA13 were 50 to 90 % of controls (data not shown). No
replication competent virus was detected in the same assay
described above (Table 4) when 1 x 10'' i.u. each of
MFGnlslacZ(A) and (RD) were tested.

EXAMPLE 11. 

Generation of plasmids. 

The CeB plasmid (Fig. 5), expressing the MoMLV gag-pol gene,
was further modified to remove the splice donor site located
in the leader region. A 272 bp fragment was PCR-generated by
using OUSD- (5'-TCTCGCTTCTGTTCGCGCGC) and OLSD-
(5'-TCGATCAAGCTTGCGGCCGCGGTGGtGGGTCGGTGGTCC) as primers and
further digested with BssHII and HindIII. A 1005 bp
HindIII-XhoI fragment isolated from CeB (encompassing a part
of the leader sequence and the beginning of MoMLV gag) and the
PCR fragment were co-inserted into pCeB from which the 1275 bp
BssHII-XhoI fragment (encompassing R-U5-leader-gag) had been
removed. The resulting plasmid, named pCeB DS- (Fig. 5),
bore the deletion of the splice donor (SD) site and a NotI
restriction site created just downstream of the lost SD
site.

A series of gag-pol expression plasmids in which the MoMLV
LTR promoter was replaced by the human cytomegalovirus
immediate early promoter (hCMV promoter) was derived from
both CeB DS- and hCMV-G (Yee et al., 1994 PNAS, 91:
9564-9568), a plasmid used as a source for the hCMV promoter. A
NotI-filled/EcoRI 7260 bp fragment was isolated from CeB DS-
and cloned into hCMV-G which had been opened with SalI
(further rendered blunt-ended) and EcoRI to remove the VSV-G
gene. The resulting plasmid was cut with ClaI and EcoRI
to remove a 1155 bp fragment encompassing sequence derived
from the 3' LTR and SV40 polyA sequence and self-ligated after
filling both protruding DNA ends. The resulting plasmid,
named phCMV-intron (Fig. 5), had gag-pol and bsr ORFs
inserted between the CMV promoter and rabbit beta-globin
polyA post-transcriptional regulatory sequences.

An intermediate plasmid was generated by sub-cloning a 7260
bp EcoRI fragment (isolated from CeB DS-) into hCMV-G opened
with EcoRI. A 1155 bp fragment (encompassing sequence
derived from the 3' LTR and SV40 polyA sequence) was removed
from this intermediate plasmid which was then
re-circularized by self-ligation after filling both ends.
The resulting plasmid, named phCMV+intron 2P (Fig. 5), was
digested with NotI and the vector was treated with Klenow
enzyme. A 1440 bp fragment (encompassing hCMV promoter and
rabbit beta-1 globin intron B (Rohrbaugh et al., 1985 Mol.
Cell Biol, 5: 147-160)) was isolated from phCMV+intron 2P by
NotI/EcoRI digestion. This fragment was further treated with
Klenow enzyme and ligated back into the vector. The
resulting plasmid, named hCMV+intron (Fig. 5), could express
gag-pol and bsr genes driven by the hCMV promoter and bore
an intron sequence derived from rabbit beta-1 globin intron
B having both SD and SA (splice acceptor) sites.



A 2450 bp fragment was removed from phCMV+intron 2P by
NotI/XhoI digestion. The resulting vector fragment was then
used to co-ligate a 1330 bp fragment (containing hCMV
promoter + 5' end of rabbit beta-1 globin intron B (with SD
site)) isolated from phCMVG by ApaI-filled/NotI digestion
and a 1 kb fragment isolated from phCMV+intron 2P by
NotI-filled/XhoI digestion. Compared to phCMV+intron 2P, the
resulting plasmid, named hCMV+SD intron (Fig. 5), had the
deletion of the 3' end of the rabbit beta-1 globin intron B
and thus no SA site in the leader region.

Construct phCMV+leader (Fig. 5) has been described elsewhere
(Savard et al., unpublished). This plasmid, in which gag-pol
and bsr genes were driven by the hCMV promoter, had the
MoMLV SD site in the leader region.

Gag-pol expression. 

The different constructs, including the parental CeB
plasmid, were analysed comparatively in a complementation
assay after transfection in TEL-FBdelPASAF cells expressing
4070A-MLV (amphotropic) envelope and harboring a MFGnlslacZ
provirus. The transient production of lacZ retroviruses as
well as the stable production of lacZ retroviral vectors
after selection with blasticidin S were determined (Table
5). All the constructs were able to rescue infectious lacZ
retroviruses, indicating the expression of gag-pol proteins
after transient transfection. Most likely due to the
efficient hCMV and rabbit beta-1 globin intron B
(post)-transcriptional regulatory sequences, hCMV+intron was
particularly potent in transient retroviral vector
production. However, 10 times fewer blasticidin-resistant
colonies were obtained with hCMV+intron compared to
CeB, and stable lacZ virus production from hCMV+intron was
about 5-10 times lower than that of CeB. Clonal examination
of lacZ retrovirus production from blasticidin-resistant
colonies indicated that 80-90% of colonies could express
high levels of gag-pol proteins for both hCMV+intron and CeB
plasmids. In contrast, despite variation in their ability to
form blasticidin-resistant colonies after transfection and
despite their ability to express gag-pol proteins from
transient transfectants, all other constructs had a weak
capacity for rescuing lacZ retroviral vectors from stable
transfectants (Table 5).



Table 5. Comparative study of gag-pol-bsr plasmids.

gag-pol-bsr       Transient        no clones   Stable           % gag-pol
plasmid           (lacZ i.u./ml)   bsr         (lacZ i.u./ml)   /bsr

CeB               300              50          10^              90%
CeB DS-           144              5           10^              50%
hCMV+intron 2P    ND               20          10^              50%
hCMV-intron       812              0
hCMV+SD intron    150              1000        10^ -            nd
hCMV+leader       328              1000        10^-10^          nd
hCMV+intron       12000            5           10^-10'          80%

Northern blot analyses were performed on stable
transfectants (blasticidin-resistant) obtained with some of
the gag-pol-bsr plasmids. As expected, the results (not
shown) displayed a correlation between expression of gag-pol
mRNAs and gag-pol protein expression detected by rescue
analysis (Table 5). The CeB construct was found to produce 2-3
fold more gag-pol mRNAs compared to hCMV+intron.
Interestingly, an unexpected 2.45 kb RNA band was found for
the hCMV+intron construct at a ratio of 2:1 compared to the
abundance of the gag-pol mRNA band (at 5.95 kb). Further
investigations using other probes revealed that a cryptic
splice donor (SD) site located in the gag gene (right in
the middle of the CA coding region at position 1596-1597,
numbering according to Shinnick et al., 1981 Nature
(London) 293: 543-548) was activated in this latter
construct. The 2.45 kb RNA species, lacking the 3' half of the
gag gene and most of the pol gene, is unlikely to give rise
to any useful translational product. It is therefore
interesting to notice that the hCMV+intron construct was able
to give rise to slightly more transcripts (gag-pol 5.95 kb
mRNA + 2.45 kb alternative RNA band) compared to gag-pol mRNA
expressed from the CeB construct. Therefore we decided to
inactivate the cryptic SD site in the hCMV+intron construct
in order to increase the ratio of gag-pol mRNAs.
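
The reasoning behind "slightly more transcripts" can be made explicit with a small calculation. This sketch assumes that the 2:1 ratio means the 2.45 kb band is twice as abundant as the 5.95 kb gag-pol band in hCMV+intron transfectants, and it expresses all amounts relative to the hCMV+intron gag-pol mRNA (set to 1). The function and key names are hypothetical.

```python
# Sketch: transcript bookkeeping for hCMV+intron versus CeB, in units of the
# hCMV+intron full-length (5.95 kb) gag-pol mRNA. Assumes the 2:1 ratio means
# 2.45 kb band : 5.95 kb band = 2 : 1. Names are illustrative only.

def transcript_budget(ceb_fold_over_hcmv: float,
                      short_to_full_ratio: float = 2.0) -> dict:
    """Return relative transcript amounts for the two constructs."""
    hcmv_full = 1.0                                 # 5.95 kb gag-pol mRNA
    hcmv_short = short_to_full_ratio * hcmv_full    # 2.45 kb spliced RNA
    hcmv_total = hcmv_full + hcmv_short
    return {
        "ceb_full": ceb_fold_over_hcmv * hcmv_full,
        "hcmv_total": hcmv_total,
        "hcmv_full_fraction": hcmv_full / hcmv_total,
    }


# CeB was reported to give 2-3 fold more gag-pol mRNA than hCMV+intron.
for fold in (2.0, 3.0):
    print(fold, transcript_budget(fold))
```

On these assumptions hCMV+intron makes 3 units of transcript in total, of which only one third is full-length gag-pol mRNA, against 2 to 3 units of gag-pol mRNA from CeB. Removing the cryptic splice donor would therefore be expected to redirect transcripts that are currently lost to the 2.45 kb species.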

Assays for transfer of gag-pol functions. 

Although the supernatants of packaging cell lines generated
with the CeB gag-pol expression construct were devoid of
replication-competent retroviruses, they were found
sporadically to transfer gag-pol genomes (Example 9, Table
4) (Cosset et al., 1995 J. Virol 69: 7430-7436). Because the
gag-pol-bsr constructs generated here by using the hCMV
promoter had far fewer retroviral sequences homologous to
the retroviral vector than the parental CeB construct (Fig.
5), they are less likely to give rise to gag-pol recombinant
(GPR) viruses. Therefore, the most efficient gag-pol-bsr
plasmids, hCMV+intron and CeB, were further analysed for
emergence of GPR viruses. To assay for such recombinant
retroviruses, we attempted to mobilise a lacZ provirus from
an indicator cell line which could cross-complement
potential recombinant viruses carrying functional gag-pol
genes. Results displayed in Table 6 showed that, consistent
with data reported previously (Example 9, Table 4) (Cosset et
al., 1995 supra), lacZ retrovirus vectors generated by using
the CeB gag-pol construct were contaminated with GPR viruses.
In contrast, lacZ retrovirus vectors generated by using the
hCMV+intron construct were completely devoid of such GPR
viruses, suggesting that this construct was improved
compared to CeB with respect to the emergence of recombinant
viruses.



Table 6. Comparative study of gag-pol-bsr plasmids.

plasmid        input virus        no. of experiments
               (lacZ i.u.)(a)     giving titres of(b)

CeB            5x10^              5      3      0
               5x10^              2      4      2
               5x10^              0      1      7
hCMV+intron    5x10^              0      0      8
               5x10^              0      0      8
               5x10^              0      0      8

4x10^4 cells of TEL/MOSAF in 24 wells were challenged with lacZ(A) at the i.u.
indicated in the table (a), and incubated at 37°C for 3 days. Cells were
trypsinized and transferred into small flasks. Cell supernatant was harvested on
day 5 after lacZ(A) challenge and plated on either TE671 (not shown) or 3T3 cells
(b). No lacZ was mobilized into TE671 at all. LacZ(A) from CMV-int 10 again
did not rescue lacZ from TEL/MOSAF.

Example 12

Generic primers to detect D-type (Medstrand and Blomberg,
J. Virol. (1993) 67:6778-6787), C-type (Shih et al., J.
Virol. (1989) 63:64-75) and human endogenous virus RTVL-H
(Wilkinson et al., J. Virol. (1993) 67:2981-2989) sequences
by RT-PCR were employed (Patience et al., supra). Primers to
detect the mouse endogenous VL30 element (Adams et al., Mol.
Cell. Biol. (1988) 8:2989-2998) and MFGnlslacZ RNA were designed
and synthesized (Table X). Overnight supernatants (in 4 ml of
culture medium) from 10^6 cells of GP+EAM12lacZ25, FLYA4lacZ3
and TELCeB6FBASALF cells (Table 3) were harvested and
centrifuged in a sucrose gradient as described previously
(Patience et al., J. Virol., 70:2654-2657). Fractions
containing retrovirus particles were collected, and RNA
extracted. One twentieth of the RNA preparation or
dilutions thereof were applied to RT-PCR as described
previously (Table X). 1/200 of the RNA harvested from
GP+EAM12lacZ25 cells was positive for VL30 RNA. MFGnlslacZ
RNA was found in 1/20 of the RNA from GP+EAM12lacZ and
TELCeB6FBASALF cells and in 1/200 of the RNA from FLYA4lacZ3
cells. The primer combinations for RTVL-H, C- and D-type RNA
did not give detectable PCR product.



Table 7. RT-PCR detection of endogenous retrovirus RNA
associated with virus particles.

                                          RT-PCR of virion-associated RNA from(a)
primer (5'-3')                            GP+EAM12    FLYA4    TELCeB6F
forward (F)/reverse (R)                   lacZ25      lacZ3    BASALF

MFGnls   F) CTCTGGCTCACAGTACGACGTAG
lacZ     R) CCATCAATCCGGTAGGTTTTCCG

C-type   F) CARRGKTTCAARAACWSYCCCAC
         R) AGYARVGTAGCNGGGTTHAGG

D-type   F) TCCCCTTGGAATACTCCTGTTTTYGT
         R) CATTCCTTGTGGTAAAACTTTCCAYTG

RTVL-H   F) CCTCACCCTGATCACRYTTG          NT
         R) GAATTATGTCTGACAGAAGGG

VL30     F) GTTGACATCTGCAGAGAAAGACC       ++          NT       NT
         R) TCTGAGGTCTGTACACACAATGG

a: -, not detected; +, detected in 1/20 RNA preparation; ++, detected in 1/200 RNA
preparation; NT, not tested because the cells do not possess the corresponding
genes.
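
Several primers in Table 7 are written with IUPAC ambiguity codes (R, Y, K, W, S, V, H, N), so each printed primer stands for a pool of sequences. As a minimal sketch, not part of the original protocol, the following Python counts and enumerates the sequences represented by a degenerate primer. The function names are illustrative, and the primer strings are copied from the table as printed, so any transcription error in the table carries over.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer: str) -> int:
    """Number of distinct sequences a degenerate primer represents."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count

def expand(primer: str):
    """Yield every concrete sequence encoded by the primer."""
    for combo in product(*(IUPAC[b] for b in primer.upper())):
        yield "".join(combo)

# C-type primers as printed in Table 7.
c_type_forward = "CARRGKTTCAARAACWSYCCCAC"
c_type_reverse = "AGYARVGTAGCNGGGTTHAGG"
print(degeneracy(c_type_forward), degeneracy(c_type_reverse))
```

The degeneracy grows multiplicatively with each ambiguous position, which is one reason generic primers of this kind are usually kept short in their degenerate regions.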

Example 13

Generation of gag-pol pre-packaging cells by using TE671
cells.



CeB, a plasmid designed to over-express MoMLV gag and pol
proteins, was introduced into TE671 human rhabdomyosarcoma
cells (ATCC CRL8805). After selection with blasticidin, 50
bsr-positive colonies were isolated and the RT (reverse
transcriptase) activity was analysed in their supernatants.
12 TE671-CeB (TECeB) clones with high RT activity were
selected for further analysis. The best TECeB clone, clone
#15, had an RT activity roughly equivalent to that of TELCeB6
cells (Cosset et al., J. Virol. 69:7430-7436 (1995); see
also Example 7, Table 6 in this patent application) but
displayed 2-3 fold more gag precursors in cells, as
demonstrated in immunoblots by using anti-CA antibodies. The
biological activity of the gag-pol proteins expressed in the six
best TECeB clones was further confirmed by their ability to
produce infectious retroviruses in a complementation assay.
A lacZ provirus was introduced into each of the TECeB clones
by polyclonal cross-infection using lacZ(RD114) helper-free
retrovirus vectors. FBMOSALF, a MoMLV env expression
plasmid (Cosset et al., J. Virol. 69:6314-6322), was then
transfected into each of the TECeB-lacZ lines and into the
TELCeB6 cell line for comparison. After selection with
phleomycin, the titer of lacZ retrovirus vectors was
determined in the supernatant of pools of phleomycin-resistant
colonies for each of the TECeB-lacZ-FBMOSALF lines. A
good correlation was found between gag-pol expression in
the TE-CeB clones (as determined by RT assays and anti-gag
immunoblots) and their ability to release infectious lacZ
particles. TE-CeB15 cells could release approximately the
same number of lacZ particles as TELCeB6 cells,
although TELCeB6 cells had the advantage of being selected
for lacZ expression (Cosset et al., J. Virol. 69:7430-7436
(1995)). TE-CeB15 cells were therefore used to derive
retroviral packaging cell lines.

Construction of env-expression plasmids. 

A series of plasmids (Fig. 3) was designed to allow
expression of different retroviral envelope genes (isolated
from MoMLV, GALV (Gibbon Ape Leukemia Virus) and MLV-10A1).
FBdelPMOSAF (Fig. 3, nucleotide sequence in Fig. 10) and
FBdelP10A1SAF, expressing ecotropic MoMLV or MLV-10A1
envelopes, were generated by replacing the BglII/ClaI
fragment from FBdelPASAF (Cosset et al., J. Virol. 69:7430-
7436 (1995); see also Example 7, Fig. 2 and nucleotide
sequence in Fig. 9) encompassing most of the env gene and
splice acceptor site with that of MoMLV (position 5407 to
7679, Shinnick et al., 1981) or with that of MLV-10A1 (Ott et
al., J. Virol. 64:757-766 (1990)).

Nucleotides 7514-7516 of GALV (Delassus et al., Virology
173:205-213 (1989)) were mutated by PCR-mediated mutagenesis
to create a ClaI site (AAG to CGA), thereby introducing a
conservative modification (a lysine (amino acid 665 of the GALV
env precursor) to an arginine). The BamHI/ClaI fragment (nts
4994 (Delassus et al., Virology 173:205-213 (1989)) to 7517)
was then sub-cloned into FBdelPASAF from which the BglII/ClaI
fragment encompassing most of the env gene and splice acceptor
site had been removed. The resulting plasmid, expressing GALV
envelope glycoproteins, was named FBdelPGASAF (Fig. 3,
nucleotide sequence in Fig. 11).
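
The AAG to CGA change described for GALV can be checked against the standard genetic code and the ClaI recognition sequence (ATCGAT). The snippet below is an illustrative sketch only: the codon table is a two-entry excerpt, the helper name is invented for this example, and the flanking bases in `original` are hypothetical placeholders chosen so that the new CGA codon completes a ClaI site. They are not GALV sequence.

```python
# Excerpt of the standard genetic code; only the two codons discussed here.
CODON_TABLE = {"AAG": "Lys", "CGA": "Arg"}

CLAI_SITE = "ATCGAT"  # ClaI recognition sequence

def mutate_codon(seq: str, start: int, new_codon: str) -> str:
    """Replace the 3-nt codon beginning at 0-based index `start`."""
    if len(new_codon) != 3:
        raise ValueError("a codon must be exactly three nucleotides")
    return seq[:start] + new_codon + seq[start + 3:]

# Hypothetical context: the flanking bases are placeholders, not GALV sequence.
original = "GGATAAGTCC"
mutated = mutate_codon(original, 4, "CGA")

print(CODON_TABLE[original[4:7]], "->", CODON_TABLE[mutated[4:7]])
print("ClaI site before:", CLAI_SITE in original, "after:", CLAI_SITE in mutated)
```

Lysine and arginine are both basic residues, which is why the text describes the substitution as conservative.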

CMV10A1 was generated by inserting a Klenow enzyme-filled
EagI/SalI fragment from FBdelP10A1SAF (encompassing the 10A1 MLV
env gene and phleo selectable marker) into hCMV-G digested
with BamHI and filled with Klenow enzyme. The resulting
plasmid, CMV10A1 (Fig. 3 and nucleotide sequence in Fig. 13),
could express 10A1 envelopes under control of the hCMV
promoter and the phleo selectable marker by translation
re-initiation.

Generation of a multi-tropic set of TE671-based retroviral
packaging lines.

FBdelPRDSAF (Fig. 3, nucleotide sequence in Fig. 12),
FBdelPASAF, FBdelPGASAF, FBdelPMOSAF and FBdelP10A1SAF were
independently introduced into cells of the TE-CeB15
pre-packaging line, expressing MoMLV gag-pol proteins.
Transfected cells were phleomycin-selected and 15-20
phleo-resistant colonies were isolated for each env-expression
plasmid transfected.

Individual colonies were then analysed for expression of
envelope glycoproteins by immunoblots on cell lysates, using
antibodies against RD114 SU glycoproteins, against
Rauscher leukemia virus SU (to screen MoMLV, MLV-4070A and
MLV-10A1 env-producer clones) or against GALV. The best
env-producer colonies as determined in this assay were further
analysed by a complementation assay after introducing a lacZ
retroviral vector. LacZ pseudotypes released from the
different packaging cell lines were titrated by using NIH
3T3 cells or TE671 cells as targets. Titers higher than 1x10''
lacZ i.u./ml were obtained for the best clones. Depending on
the envelope specificities expressed in these cells, the new
TE671-based retroviral packaging cell lines were named
TE-FLYE, TE-FLYA, TE-FLYRD, TE-FLY10A1 and TE-FLYGA and could
express the MoMLV, MLV-4070A, RD114, MLV-10A1 and GALV env
genes, respectively.

Assays for detecting replication-competent retroviruses
(RCRs) were performed on the supernatants of these cells and
were negative (less than 1/ml).

TE671 cells are very potent for transient expression,
resulting in more than 95% of cells expressing the transgene
three days after plasmid transfection (Hatziioannou and
Cosset, unpublished data (1996)). The ability of retroviral
packaging cell lines to transiently produce retroviral
vectors is of crucial importance for gene therapy where
vectors carrying toxic genes have to be prepared. Transient
expression of retroviral vectors was compared between
cells of the TE-FLYA line and cells of the BING
line (Pear et al., Proc. Natl. Acad. Sci. USA 90, 8392-5
(1993)), a retroviral packaging cell line designed to
transiently express retroviral vectors. The results (Table 8)
showed that TE-FLYA cells were more efficient for transient
expression of a lacZ retroviral vector, resulting in
higher titers.



Table 8. Comparative study of transient production of lacZ
vectors.

packaging     cell number(a)   % transfected   transient
cell line                      cells(b)        titer(c)

BING          281              5.3             2x10^
TE-FLYA       117              35              1.3x10^

Cells were transfected with MFGnlslacZ retroviral vectors by the calcium phosphate
precipitation method and titers of lacZ vectors (c) released in cell
supernatant were determined as lacZ i.u./ml at day 3 following transfection. The
relative number of cells (a) (average per microscope field) and the % of
transfected cells (b) determined after X-gal staining are shown.

Retroviral vectors prepared from TE671-based packaging cell
lines were analysed for their sensitivity to human
complement-mediated inactivation. Experiments were conducted
as previously described (Cosset et al., J. Virol. 69:7430-
7436 (1995); see also Example 10 in this patent application)
by using three human sera from individual donors (Table 9).
As expected, MLV-A prepared from mouse 3T3 cells was highly
sensitive to inactivation after 1 hr incubation with sera.
In contrast, titers of lacZ vectors produced from TE-FLYRD
cells were 17 to 55% of control incubations, while titers of
lacZ vectors from TE-FLYA cells were 1 to 30% of controls.



Table 9. Human serum sensitivity of viruses produced from
TE671-based packaging cell lines.

virus from:    hu56(a)         hu57(a)         BTS(a)

3T3/A          <0.2, <0.2      <0.2, <0.2      <0.2, <0.2
TE-FLYE        15, 7.8         16, 11          48, 60
TE-FLYA        1, 0.6          2.2, 7.1        28, 19
TE-FLYRD       17, 22          30, 44          54, 63

Three fresh human serum samples were tested in duplicate: hu56 (A+), hu57 (AB+),
BTS (AB+). (a) % control (average for FCS and Opti-MEM treatment) is shown.
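
Table 9 expresses each titer as a percentage of control incubations. A minimal sketch of that calculation is given here, assuming the control value is the mean of the FCS and Opti-MEM titers as the table footnote indicates. The function name and the numeric titers are hypothetical and are not data from this application.

```python
from typing import Sequence

def percent_of_control(serum_titer: float, control_titers: Sequence[float]) -> float:
    """Titer after serum incubation as a percentage of the mean control titer."""
    if not control_titers:
        raise ValueError("at least one control titer is required")
    mean_control = sum(control_titers) / len(control_titers)
    return 100.0 * serum_titer / mean_control

# Hypothetical titers in lacZ i.u./ml (illustration only).
fcs_titer, optimem_titer = 1.0e5, 1.2e5
after_human_serum = 2.2e4
residual = percent_of_control(after_human_serum, [fcs_titer, optimem_titer])
print(round(residual, 1))
```

Averaging the two control treatments gives a single baseline per virus preparation, so duplicate serum incubations can be reported side by side as in the table.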
CLAIMS:

1. A recombinant expression vector comprising a gene of
interest and a selectable marker gene, wherein the
selectable marker gene is arranged downstream of the
gene of interest and a stop codon associated with the
gene of interest is spaced from a start codon of said
selectable marker gene at a distance which is
sufficient to ensure that said selectable marker
protein is expressed from the corresponding mRNA as a
result of translation reinitiation.

2. A recombinant expression vector according to claim 1
wherein the vector is a viral vector.

3. A recombinant expression vector according to claim 2
wherein the vector is a retroviral vector.

4. A recombinant expression vector according to any one
of claims 1 to 3 wherein the gene of interest is
included as part of a viral packaging construct.

5. A recombinant expression vector according to any one
of the preceding claims wherein the number of
nucleotides in the space between the stop codon of the
gene of interest and the start codon of the selectable
marker is in the range of from 20 to 200 nucleotides.

6. A recombinant expression vector according to claim 5
wherein the number of nucleotides in the space between
the stop codon of the gene of interest and the start
codon of the selectable marker is in the range of from
60 to 80 nucleotides.

7. A process for producing a cell line in which a gene of
interest is expressed, which process comprises:

transforming host cells with an expression vector
according to any one of the claims 1 to 6; and

selecting those cells where expression of the
selection marker gene may be detected.

8. A process according to claim 7 wherein the host cell
is a eukaryotic cell.

9. A host cell transformed with a recombinant expression
vector according to any one of the claims 1 to 6.

10. A retroviral packaging cell line comprising a host
cell transformed with a first and a second recombinant
expression vector, said first recombinant expression
vector having a packaging-deficient construct
comprising a viral gag-pol gene and a first selectable
marker gene downstream thereof, and said second
recombinant expression vector having a packaging-deficient
construct comprising a viral env gene and a
second selectable marker gene downstream thereof;
wherein the start codon of the first and second
selectable markers are spaced from the stop codons of
the viral gag-pol gene and the viral env gene
respectively by a distance which ensures that said
selectable marker protein is expressed from the
corresponding mRNA as a result of translation
reinitiation.

11. A retroviral packaging cell line according to claim 10
wherein the first selectable marker is a bsr
selectable marker and the second selectable marker is
a phleo selectable marker.

12. A retroviral packaging cell line according to any one
of claims 10 or 11 wherein the packaging-deficient
construct comprising the viral gag-pol gene and first
selectable marker is the CeB (SEQ ID No 2) expression
construct.

13. A retroviral packaging cell line according to any one
of claims 10 or 11 wherein the packaging-deficient
construct comprising the viral env gene and second
selectable marker is the FBdelPASAF (SEQ ID No 5),
the FBdelPMOSAF (SEQ ID No 6), the FBdelPGASAF (SEQ ID
No 7), the FBdelPRDSAF (SEQ ID No 8), the FBdelPXSAF
(Fig. 3), the FBdelP10A1SAF (Fig. 3), or the
FBdelPVSVGSAF (Fig. 3) expression construct.

14. A retroviral packaging cell line according to any one
of claims 10 or 11 wherein the recombinant expression
vector is a packaging-deficient retroviral helper
construct.

15. A retroviral packaging cell line according to claim 14
wherein the overlapping sequences between the genomes
of the retroviral vector and the packaging-deficient
construct are reduced by minimizing the extent of
non-coding retroviral sequences in the packaging-deficient
genome.

16. A retroviral packaging cell line according to any one
of claims 10 to 15 wherein the viral gag-pol gene and
the selectable marker are expressed under the control
of a non-retroviral promoter.

17. A retroviral packaging cell line according to claim 16
wherein the promoter is fused to rabbit beta-1 globin
intron.

18. A retroviral packaging cell line according to claim 16
or claim 17 wherein the promoter is a hCMV promoter.

19. A retroviral packaging cell line according to any one
of claims 16 to 18 wherein the viral gag-pol
gene and the selectable marker is a hCMV+intron (SEQ
ID No 3) or a hCMV+intronkaSD (SEQ ID No 4) expression
construct.

20. A retroviral packaging cell line according to any one
of claims 10 to 15 wherein the viral env gene and the
selectable marker are under the control of a
non-retroviral promoter.

21. A retroviral packaging cell line according to claim 20
wherein the promoter is fused to rabbit beta-1 globin
intron.

22. A retroviral packaging cell line according to claim 20
or claim 21 wherein the promoter is a hCMV promoter.

23. A retroviral packaging cell line according to any one of
claims 20 to 22 wherein the viral env gene and the
selectable marker is a CMV10A1 (SEQ ID No 9)
expression construct.

24. A retroviral packaging cell line according to any one
of claims 10 to 23 wherein the cell line is the HT1080
line, the TE671 line, the 3T3 line, the 293 line or
the Mv-1-Lu line.

25. A retroviral packaging cell line according to any one
of claims 10 to 24 wherein the retroviral packaging
cells comprise human HT1080 cells and express RD114
envelopes.

26. A retroviral packaging cell line according to any one
of claims 10 to 24 wherein the retroviral packaging
cells comprise human TE671 cells and express RD114
envelopes.
27. A process for producing a retroviral packaging cell
line in which a gene of interest is expressed, which
process comprises:

transforming host cells with a first and a second
recombinant expression vector, said first recombinant
expression vector having a packaging-deficient
construct comprising a viral gag-pol gene and a first
selectable marker gene downstream thereof, and said
second recombinant expression vector having a
packaging-deficient construct comprising a viral env
gene and a second selectable marker gene downstream
thereof; wherein the start codon of the first and
second selectable markers are spaced from the stop
codons of the viral gag-pol gene and the viral env
gene respectively by a distance which ensures that
said selectable marker protein is expressed from the
corresponding mRNA as a result of translation
reinitiation; and

selecting transformed cells which express said
first and/or second marker genes.

28. A packaging deficient construct for use in a process
according to claim 27, which expresses a viral gag-pol
gene and a selectable marker wherein a start codon of
the selectable marker is spaced from a stop codon of
the viral gag-pol gene by a distance which ensures
that said selectable marker protein is expressed from
the corresponding mRNA as a result of translation
reinitiation.

29. A packaging deficient construct for use in a process
according to claim 27, which expresses a viral env
gene and a selectable marker gene; wherein a start
codon of the selectable marker is spaced from a stop
codon of the viral env gene by a distance which
ensures that said selectable marker protein is
expressed from the corresponding mRNA as a result of
translation reinitiation.
[Figure 1 drawing. Labelled elements: MLV-LTR, del psi, Mo MLV GAG-POL, SA, BSR, poly A]

pol gene.. TCT AGA CTG ACA TGG CGC
GTT CAA CGC TCT CAA AAC CCC TTA
AAA ATA AGG TTA ACC CGC GAG GCC
CCC TAA

tccccttaattcttctcatgctcagaggggtcagtac
tgcttcgcccggctccagtgcggcccagccggccacc
ATG AAA ACA TTT AAC ATT TCT... bsr gene

Figure 1. Schematic structure of CeB expression vector
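
Claims 5 and 6 define the spacer by the number of nucleotides between the stop codon of the gene of interest and the start codon of the selectable marker, as drawn for CeB in Figure 1. The following sketch shows one way such a distance could be measured on a bicistronic sequence. It is illustrative only: the function names and the toy sequence are assumptions, and the caller is expected to supply the positions of the upstream stop codon and the downstream start codon.

```python
def spacer_length(seq: str, stop_index: int, start_index: int) -> int:
    """Nucleotides strictly between an upstream stop codon and a downstream ATG.

    stop_index  -- 0-based index of the first base of the stop codon
    start_index -- 0-based index of the A of the downstream ATG
    """
    seq = seq.upper()
    if seq[stop_index:stop_index + 3] not in {"TAA", "TAG", "TGA"}:
        raise ValueError("no stop codon at stop_index")
    if seq[start_index:start_index + 3] != "ATG":
        raise ValueError("no ATG at start_index")
    return start_index - (stop_index + 3)

def in_range(length: int, low: int, high: int) -> bool:
    """True if low <= length <= high (inclusive bounds)."""
    return low <= length <= high

# Toy fragment: end of an upstream ORF, a 70-nt placeholder spacer
# containing no ATG, then the start of a downstream ORF.
spacer = "C" * 70
toy = "GCCCCCTAA" + spacer + "ATGAAAACA"
n = spacer_length(toy, toy.index("TAA"), toy.index("ATG"))
print(n, in_range(n, 20, 200), in_range(n, 60, 80))
```

Passing explicit coordinates, rather than searching for the first ATG, keeps the measurement tied to the annotated open reading frames.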
[Figure 2 drawing. Labelled elements: FB29 LTR, del psi, SD, ENV, SA, PHLEO, poly A]

env gene.. TAC GAG CCA TAG
ggcgcc tagtgttgacaattaatcatcggcatagtata
cggcatagtataatacgactcactataggagggccacc
ATG GCC AAG TTG ACC.. phleo gene

Figure 2. Schematic structure of FBdelPASAF expression vector
[Figure 3 drawing. Elements and restriction sites legible in each vector map:]

FBdelPASAF:     FB29 LTR, 4070A MLV env, phleo, polyA; ClaI, KasI, MscI/BglII
FBdelPMOSAF:    FB29 LTR, Mo MLV env, phleo, polyA; KasI, MscI/BglII
FBdelPRDSAF:    FB29 LTR, RD114 env, phleo, polyA; KasI, MscI/NdeI
FBdelPBSAF:     FB29 LTR, FeLV-B env, phleo, polyA; KasI, MscI/BamHI
FBdelPXSAF:     FB29 LTR, NZB MLV-X env, phleo, polyA; ClaI, KasI, MscI/BglII
FBdelP10A1SAF:  FB29 LTR, 10A1 MLV env, phleo, polyA; ClaI, KasI, MscI/BglII
FBdelPGASAF:    FB29 LTR, GALV env, polyA; KasI, MscI/BamHI
FBdelPVSVGSAF:  FB29 LTR, VSV-G, phleo, polyA; KasI, MscI/BglII
CMV10A1SAF:     hCMV, SD, SA, 10A1 MLV env, phleo, polyA; ClaI, KasI, BamHI/EagI

Figure 3. Schematic structure of env expression vectors
NGAGCTCAGGACAGGTAGAAAGAATGAATAGAACAATAAAAGAGACCCTTACTAAATTGA 60
CCTTAGAGACTGGCTTAAAAGATTGGAGACGCCTCCTATCTCTGGCTTTGTTAAGAGCCA 120 
GAAATACGCCCAACCGTTTTCGGCTCACCCCATATGAAATCCTTTATGGGGGACCCCCCC 180 
CTTTGTCAACCTTGCTCAATTCCTTCTCCCCCTCCGATCCTAAGACTGATTTACAAGCCC 240 
GACT AAAAGGGCTGC AAGGCGTGC AGGCCCAAATCTGG ACACCCCTGGCCGAATTGTACC 300 
GGCCAGGACATCCACAAACTAGCCACCCATTTCAGGTGGGAGACTCCGTGTACGTCCGGC 3 60 
GGC ACCGCTCTCAAGGATTGGAGCCTCGTTGGAAGGGACCTTACATCGTCCTGCTGACCA 420 
CGCCCACCGCC ATAAAGGTTGACGGGATCGCCGCCTGGATTCACGCATCGCACGGC AAGG 480 
CAGCCCCAAAAACCCCTGGACCAGAAACTCCCAAAACCTGGAAGCTCCGCCGTTCGGAGA 540 
ACCCTCTTAAGATAAGACTCTCCCGTGTCTGACTGCTAATCCACCTTGTCCCTGTACTAA 600 
CCCAAAATGAAACTCCCAACAGGAATGGTCATTTTATGTAGCCTAATAAT AGTTCGGGCA 660 
GGGTTTGACGACCCCCGCAAGGCTATCGCATTAGTACAAAAACAACATGGTAAACCATGC 720 
GAATGCAGCGGAGGGCAGGTATCCGAGGCCCCACCGAACTCCATCCAACAGGTAACTTGC 780
CCAGGCAAGACGGCCTACTTAATGACCAACCAAAAATGGAAATGCAGAGTCACTCCAAAA 840 
ATCTCACCTAGCGGGGGAGAACTCCAGAACTGCCCCTGTAACACTTTCCAGGACTCGATG 900 
C ACAGTTCTTGTT ATACTGAATACCGGCAATGCAGGCGAATTAATAAGACATACTAC ACG 9 60 
GCC ACCTTGCTT AAAATACGGTCTGGGAGCCTC AACGAGGTACAGAT ATT AC AAAACCCC 1020 
AATCAGCTCCTACAGTCCCCTTGTAGGGGCTCTATAAATCAGCCCGTTTGGTGGAGTGCC ■ 1080 
ACAGCCCCCATCC AT ATCTCCGATGGTGGAGGACCCCTCGAT ACTAAGAGAGTGTGGACA 1140 
GTCCAAAAAAGGCTAGAACAAATTCATAAGGCTATGACTCCTGAACTTCAATACCACCCC 1200 
TTAGCCCTGCCCAAAGTCAGAGATGACCTTAGCCTTGATGCACGGACTTTTGATATCCTG 1260 
AATA.CC ACTTTTAGGTTACTCC AGATGTCCAATTTTAGCCTTGCCCAAGATTGTTGGCTC 1320 
TGTTTAAAACTAGGTACCCCTACCCCTCTTGCGATACCCACTCCCTCTTTAACCTACTCC 1380 
CTAGCAGACTCCCTAGCGAATGCCTCCTGTCAGATTATACCTCCCCTCTTGGTTCAACCG 1440 
ATGC AGTTCTCC AACTCGTCCTGTTTATCTTCCCCTTTC ATT AACGAT ACGGAAC AAAT A 1500 
G ACTTAGGTGC AGTCACCTTT ACTAACTGC ACCTCTGT AGCC AATGTCAGTAGTCCTTTA 1560 
TGTGCCCTAAACGGGTCAGTCTTCCTCTGTGGAAATAACATGGCATACACCTATTTACCC 1620 
CAAAACTGGACCAGACTTTGCGTCCAAGCCTCCCTCCTCCCCGACATTGACATCAACCCG 1680
GGGGATGAGCCAGTCCCCATTCCTGCCATTGATCATTATATACATAGACCTAAACGAGCT 1740' 
GTACAGTTCATCCCTTTACTAGCTGGACTGGGAATCACCGCAGCATTCACCACCGGAGCT 1800 
AC AGGCCT AGGTGTCTCCGTC ACCC AGT ATAC AAAATT ATCCCATC AGTT AATATCTG AT 1860 
GTCC AAGTCTTATCCGGTACC ATAC AAG ATTTACAAGACCAGGT AGACTCGTT AGCTG AA 1920 
GTAGTTCTCC AAAATAGGAGGGGACTGGACCTACTAACGGCAGAACAAGGAGGAATTTGT 1980 
TTAGCCTTACAAG AAAAATGCTGTTTTTATGCTAACAAGTCAGGAATTGTGAGAAACAAA 2040 
ATAAGAACCCTACAAGAAGAATTACAAAAACGCAGGGAAAGCCTGGCAACCAACCCTCTC 2100 
TGG'ACCGGGCTGCAGGGCTTTCTTCCGTACCTCCTACCTCTCCTGGGACCCCTACTCACC 2160. 
CTCCT ACTC ATACTAACCATTGGGCCATGCGTTTTCAGTCGCCTCATGGCCTTCATT AAT 2220 
GAT AGACTT AATGTTGTAC ATGCC ATGGTGCTGGCCC AGC AATACCAAGC ACTCAAAGCT 2280 
G AGGAAG AAGCTC AGGATTGAGCTTCCGGGACAAAAGC AGGGGGGAATGAGAAGTC AGAA 2340 
CCCCCC ACCTTTGCTACAT AAATAACCGCTTTC ATTTCGCTTCTGTAAAACGCTT ATGCG 2 400 
CCCCACCCTAGCCGGAAAGTCCCCAGCCGCTACGCAACCCGGGCCCCGAGTTGCATCAGC 2460 
CGTTCGCAACCCGGGCTCCGAGTTGCATCAGCCGAAAGAAACTTCATTTCCCAAGCTT 2518 



Fig. 4 
Figure 6. CeB Sequence

AATGAAAGAC CCCACCTGTA GGTTTGGCAA GCTAGCTTAA GTAACGCCAT TTTGCAAGGC 60

ATGGAAAAAT ACATAACTGA GAATAGAGAA GTTCAGATCA AGGTCAGGAA CAGATGGAAC 120 

AGCTGAATAT GGGCCAAACA GGATATCTGT GGTAAGCAGT TCCTGCCCCG GCTCAGGGCC 180 
AAGAACAGAT GGAACAGCTG AATATGGGCC AAACAGGATA TCTGTGGTAA GCAGTTCCTG " 240 

CCCCGGCTCA GGGCCAAGAA CAGATGGTCC CCAGATGCGG TCCAGCCCTC AGCAGTTTCT 300 

AGAGAACCAT CAGATGTTTC CAGGGTGCCC CAAGGACCTG AAATGACCCT GTGCCTTATT 360 

TGAACTAACC AATCAGTTCG CTTCTCGCTT CTGTTCGCGC GCTTCTGCTC CCCGAGCTCA 420 

ATAAAAGAGC CCACAACCCC TCACTCGGGG CGCCAGTCCT CCGATTGACT GAGTCGCCCG 480 

GGTACCCGTG TATCCAATAA ACCCTCTTGC AGTTGCATCC GACTTGTGGT CTCGCTGTTC 540 

CTTGGGAGGG TCTCCTCTGA GTGATTGACT ACCCGTCAGC GGGGGTCTTT CATTTGGGGG 600 

CTCGTCCGGG ATCGGGAGAC CCCTGCCCAG GGACCACCGA CCCACCACCG GGAGGTAAGC 660 

TGGAAGCTTC TGCAGCATCG TTCTGTGTTG TCTCTGTCTG 'ACTGTGTTTC TGTATTTGTC 720 

TGAGAATATG GGCCAGACTG TTACCACTCC . CTTAAGTTTG ACCTTAGGTC ACTGGAAAGA 780 

TGTCGAGCGG ATCGCTCACA ACCAGTCGGT AGATGTCAAG AAGAGACGTT GGGTTACCTT 840 

CTGCTCTGCA GAATGGCCAA CCTTTAACGT CGGATGGCCG CGAGACGGCA CCTTTAACCG 900 

AGACCTCATC ACCCAGGTTA AGATCAAGGT CTTTTCACCT GGCCCGCATG GACACCCAGA 960 

CCAGGTCCCC TACATCGTGA CCTGGGAAGC CTTGGCTTTT GACCCCCCTC , CCTGGGTCAA 1020 

GCCCTTTGTA CACCCTAAGC CTCCGCCTCC TCTTCCTCCA TCCGCCCCGT CTCTCCCCCT 1080 

TGAACCTCCT CGTTCGACCC CGCCTCGATC CTCCCTTTAT CCAGCCCTOV CTCCTTCTCT 1140' 

AGGCGCCAAA CCTAAACGTC AAGTTCTTTC TGACAGTGGG -GGGCCGCTCA TCGACCTACT 1200 

TACAGAAGAC CCCCCGCCTT ATAGGGACCC AAGACCACCC CCTTCCGACA GGGACGGAAA 1260 

TGGTGGAGAA GCGACCCCTG CGGGAGAGGC ACCGGACCCC TCCCCAATGG CATCTCGCCT 1320 

ACGTGGGAGA CGGGAGCCCC CTGTGGCCGA CTCCACTACC TCGCAGGCAT TCCCCCTCCG 1380 

CGCAGGAGGA AACGGACAGC TTCAATACTG GCCGTTCTCC TCTTCTGACC TTTACAACTG 1440 

GAAAAATAAT AACCCTTCTT TTTCTGAAGA TCCAGGTAAA CTGACAGCTC TGATCGAGTC 1500 

TGTTCTCATC ACCCATCAGC CCACCTGGGA CGACTGTCAG CAGCTGTTGG GGACTCTGCT 1560 

GACCGGAGAA GAAAAACAAC GGGTGCTCTT AGAGGCTAGA AAGGCGGTGC GGGGCGATGA 1620 

TGGGCGCCCC ACTCAACTGC CCAATGAAGT .CGATGCCGCT TTTCCCCTCG AGCGCCCAGA 1680 

CTGGGATTAC ■ ACCACCCAGG CAGGTAGGAA CCACCTAGTC CACTATCGCC AGTTGCTCCT 1740 

AGCGGGTCTC CAAAACGCGG GCAGAAGCCC CACCAATTTG GCCAAGGTAA AAGGAATAAC 1800 

ACAAGGGCCC AATGAGTCTC CCTCGGCCTT CCTAGAGAGA" CTTAAGGAAG CCTATCGCAG 18.60 

GTACACTCCT TATGACCCTG AGGACCCAGG GCAAGAAACT AATGTGTCTA TGTCTTTCAT 1920 

TTGGCAGTCT GCCCCAGACA TTGGGAGAAA GTTAGAGAGG TTAGAAGATT TAAAAAACAA 1980 

GACGCTTGGA GATTTGGTTA GAGAGGCAGA AAAGATCTTT AATAAACGAG AAACCCCGGA 2040 

AGAAAGAGAG GAACGTATCA GGAGAGAAAC AGAGGAAAAA GAAGAACGCC GTAGGACAGA 2100 

GGATGAGCAG AAAGAGAAAG AAAGAGATCG TAGGAGACAT AGAGAGATGA GCAAGCTATT 2160 

GGCCACTGTC GTTAGTGGAC AGAAACAGGA TAGACAGGGA GGAGAACGAA GGAGGTCCCA 2220 

ACTCGATCGC GACCAGTGTG CCTACTGCAA AGAAAAGGGG CACTG.GGCTA AAGATTGTCC 2280 

CAAGAAACCA CGAGGACCTC GGGGACCAAG ACCCCAGACC TCCCTCCTGA CCCTAGATGA 2340 

CTAGGGAGGT CAGGGTCAGG AGCCCCCCCC TGAACCCAGG ATAACCCTCA AAGTCGGGGG 2400 

GCAACCCGTC ACCTTCCTGG TAGATACTGG GGCCCAACAC TCCGTGCTGA CCCAAAATCC 2460 

TGGACCCCTA AGTGATAAGT CTGCCTGGGT CCAAGGGGCT AGTGGAGGAA AGCGGTATCG 2520 

CTGGACCACG GATCGCAAAG TACATCTAGC TACCGGTAAG GTCACCCACT CTTTCCTCCA 2580 

TGTACCAGAC TGTCCCTATC CTCTGTTAGG AAGAGATTTG CTGACTAAAC TAAAAGCCCA 2640 

AATCCACTTT GAGGGATCAG GAGCTCAGGT TATGGGACCA ATGGGGCAGC CCCTGCAAGT '2700 

GTTGACCCTA AATATAGAAG ATGAGCATCG GCTACATGAG ACCTCAAAAG AGCCAGATGT 2760 

TTCTCTAGGG TCCACATGGC TGTCTGATTT TCCTCAGGCC TGGGCGGAAA CCGGGGGCAT 2820 

GGGACTGGCA GTTCGCCAAG CTCCTCTGAT CATACCTCTG AAAGCAACCT CTACCCCCGT ' 2880 

GTCCATAAAA CAATACCCCA TGTCACAAGA AGCCAGACTG GGGATCAAGC ' CCCACATACA 2940 

GAGACTGTTG GACCAGGGAA TACTGGTACC CTGCCAGTCC CCCTGGAACA CGCCCCTGCT 3000 

ACCCGTTAAG AAACCAGGGA CTAATGATTA TAGGCCTGTC CAGGATCTGA GAGAAGTCAA 3060 
CAAGCGGGTG GAAGACATCC ACCCCACCGT GCCCAACCCT TACAACCTCT TGAGCGGGCT - 3120 

CCCACCGTCC CACCAGTGGT ACACTGTGCT TGATTTAAAG GATGCCTTTT TCTGCCTGAG 3180 

ACTCCACCCC ACCAGTCAGC CTCTCTTCGC CTTTGAGTGG AGAGATCCAG AGATGGGAAT 3240 

CTCAGGACAA TTGACCTGGA CCAGACTCCC ACAGGGTTTC AAAAACAGTC CCACCCTGTT 33 00 

TGATGAGGCA CTGCACAGAG ACCTAGCAGA CTTCCGGATC CAGCACCCAG ACTTGATCCT 33 60 

GCTACAGTAC GTGGATGACT TACTGCTGGC CGCCACTTCT GAGCTAGACT GCCAACAAGG 3420 

TACTCGGGCC CTGTTACAAA CCCTAGGGAA CCTCGGGTAT CGGGCCTCGG CCAAGAAAGC 3480 

CCAAATTTGC CAGAAACAGG TCAAGTATCT GGGGTATCTT CTAAAAGAGG GTCAGAGATG 3 540 

GCTGACTGAG GCCAGAAAAG AGACTGTGAT GGGGCAGCCT ACTCCGAAGA CCCCTCGACA 3 600 

ACTAAGGGAG TTCCTAGGGA CGGCAGGCTT CTGTCGCCTC TGGATCCCTG GGTTTGCAGA 3660 

AATGGCAGCC CCCTTGTACC CTCTCACCAA AACGGGGACT CTGTTTAATT GGGGCCCAGA 3720 

CCAACAAAAG GCCTATCAAG AAATCAAGCA AGCTCTTCTA ACTGCCCCAG CCCTGGGGTT 3780 

GCCAGATTTG ACTAAGCCCT TTGAACTCTT TGTCGACGAG AAGCAGGGCT ACGCCAAAGG 3 840 

TGTCCTAACG CAAAAACTGG GACCTTGGCG TCGGCCGGTG GCCTACCTGT CCAAAAAGCT 3900 

AGACCCAGTA GCAGCTGGGT GGCCCCCTTG CCTACGGATG GTAGCAGCCA TTGCCGTACT 3960 

GACAAAGGAT GCAGGCAAGC TAACCATGGG ACAGCCACTA GTCATTCTGG CCCCCCATGC 4020 

AGTAGAGGCA CTAGTCAAAC AACCCCCCGA CCGCTGGCTT TCCAACGCCC GGATGACTCA 4080 
CTATGAGGCC TTGCTTTTGG ACACGGACCG GGTCCAGTTC GGACCGGTGG TAGCCCTGAA 4140 

CCCGGCTACG CTGCTCCCAC TGCCTGAGGA AGGGCTGCAA GACAACTGCC TTGATATCCT " 4200 

GGCCGAAGCC CACGGAACCC GACCCGACCT AACGGACCAG CCGCTCCCAG ACGCCGACCA 4260 

CACCTGGTAC ACGGATGGAA GCAGTCTCTT- ACAAGAGGGA CAGCGTAAGG CGGGAGCTGC 4320 

GGTGACCACC GAGACCGAGG TAATCTGGGC TAAAGCCCTG CCAGCCGGGA CATCCGCTCA 4380 

GCGGGCTGAA CTGATAGCAC TCACCCAGGC CCTAAAGATG GCAGAAGGTA AGAAGCTAAA 4440 

TGTTTATACT GATAGCCGTT ATGCTTTTGC TACTGCCCAT ATCCATGGAG AAATATACAG 4500 

AAGGCGTGGG TTGCTCAC^T CAGAAGGCAA AGAGATCAAA AATAAAGACG AGATCTTGGC 4560 

CCTACTAAAA GCCCTCTTTC TGCCCAAAAG ACTTAGCATA ATCOITTGTC CAGGACATCA 4520 

AAAGGGACAC AGCGCCGAGG CTAGAGGCAA CCGGATGGCT GACCAAGCGG CCCGAAAGGC 4680 

AGCCATCACA GAGACTCCAG ACACCTCTAC CCTCCTCATA GAAkATTCAT CACCCTACAC 4740 
CTCAGAACAT TTTCATTACA CAGTGACTGA TATAAAGGAC CTAACCAAGT TGGGGGCCAT ' 4800 

TTATGATAAA ACAAAGAAGT ATTGGGTCTA CCAAGGAAAA CCTGTGATGC CTGACCAGTT 48 60 

TACTTTTGAA TTATTAGACT TTCTTCATCA GCTGACTCAC CtCAGCTTCT CAAAAATGAA 4920 

GGCTCTCCTA GAGAGAAGCC ACAGTCCCTA CTACATGCTG AACCGGGATC GAACACTCAA 4980 

AAATATCACT GAGACCTGCA AAGCTTGTGC ACAAGTCAAC GCCAGCAAGT CTGCCGTTAA 5040 

ACAGGGAACT AGGGTCCGCG GGCATCGGCC CGGCACTCAT TGGGAGATCG ATTTCACCGA 5100 

GATAAAGCCC GGATTGTATG GCTATAAATA TCTTCTAGTT TTTATAGATA CCTTTTCTGG 5160 • 

CTGGATAGAA GCCTTCCCAA CCAAGAAAGA AACCGCCAAG GTCGTAACCA AGAAGCTACT 5220 

AGAGGAGATC TTCCCCAGGT TCGGCATGCC TCAGGTATTG GGAACTGACA ATGGGCCTGC 5280 

CTTCGTCTCC AAGGTGAGTC AGACAGTGGC CGATCTGTTG GGGATTGATT GGAAATTACA 5340 

TTGTGCATAC AGACCCCAAA 'GCTCAGGCCA GGTAGAAAGA ATGAATAGAA CCATCAAGGA 5400 

GACTTTAACT AAATTAACGC TTGCAACTGG CTCTAGAGAC TGGGTGCTCC TACTCCCCTT 5460 

AGCCCTGTAC CGAGCCCGCA ACACGCCGGG CCCCCATGGC CTCACCCCAT ATGAGATCTT 5520 

ATATGGGGCA CCCCCGCCCC TTGTAAACTT CCCTGACCCT GACATGACAA GAGTTACTAA 5580 
CAGCCCCTCT CTCCAAGCTC ACTTAGAGGC TCTCTACTTA GTCCAGCACG AAGTCTGGAG . 5640 

ACCTCTGGCG GCAGCCTACC AAGAACAACT GGACCGACCG GTGGTACCTC ACCCTTACCG ' 5700 

AGTCGGCGAC ACAGTGTGGG TCCGCCGACA CCAGACTAAG AACCTAGAAC CTCGCTGGAA 5760 . 

AGGACCTTAC ACAGTCCTGC TGACCACCCC CACCGCCCTC AAAGTAGACG GCATCGCAGC 5820 

TTGGATACAC GCCGCCCACG TGAAGGCTGC CGACCCCGGG GGTGGACCAT CCTCTAGACT 5880" 

GACATGGCGC GTTCAACGCT CTCAAAACCC CTTAAAAATA AGGTTAACCC GCGAGGCCCC 5940 

CTAATCCCCT TAATTCTTCT GATGCTCAGA GGGGTCAGTA CTGCTTCGCC CGGCTCCAGT 6000 

GCGGCCCAGC CGGCCACCAT GAAAACATTT AACATTTCTC AACAAGATCT AGAATTAGTA 6060. 
GAAGTAGCGA CAGAGAAGAT TACAATGCTT TATGAGGATA ATAAACATCA TGTGGGAGCG . 6120. 

GCAATTCGTA CGAAAACAGG AGAAATCATT TCGGCAGTAC ATATTGAAGC GTATATAGGA 6180 • 

CGAGTAACTG TTTGTGCAGA AGCCATTGCG ATTGGTAGTG CAGTTTCGAA TGGACAAAAG 6240 

GATTTTGACA CGATTGTAGC TGTTAGACAC CCTTATTCTG ACGAAGTAGA TAGAAGTATT 6300 

CGAGTGGTAA GTCCTTGTGG TATGTGTAGG GAGTTGATTT CAGACTATGC ACCAGATTGT 63 60 

TTTGTGTTAA TAGAAATGAA TGGCAAGTTA GTCAAAACTA CGATTGAAGA ACTCATTCCA 6420 

CTCAAATATA CCCGAAATTA AAAGTTTTAC CACCAAGCTT ATCGATTAGT CCAATTTGTT 6480 

AAAGACAGGA TATCAGTGGT CCAGGCTCTA GTTTTGACTC AACAATATCA CCAGCTGAAG 6540 . 

CCTATAGAGT ACGAGCCATA GATAAAATAA AAGATTTTAT TTAGTCTCCA GAAAAAGGGG 6600 * 

GGAATGAAAG ACCCCACCTG TAGGTTTGGC AAGCTAGCTT AAGTAACGCC ATTTTGCAAG 6660 

GCATGGAAAA ATACATAACT GAGAATAGAG AAGTTCAGAT CAAGGTCAGG AACAGATGGA 6720 

ACAGTCGAGA ACTTGTTTAT TGCAGCTTAT AATGGTTACA AATAAAGCAA TAGCATCACA 6780 

AATTTCACAA ATAAAGCATT TTTTTCACTG CATTCTAGTT GTGGTTTGTC CAAACTCATC 6840 

AATGTATCTT ATCATGTCTG GATCCCCAGG AAGCTCCTCT GTGTCCTCAT AAACCCTAAC 6900 

CTCCTCTACT TGAGAGGACA TTCCAATCAT AGGCTGCCCA TCCACCCTCT GTGTCCTCCT 6960 

GTTAATTAGG TCACTTAACA AAAAGGAAAT TGGGTAGGGG TTTTTCACAG ACCGCTTTCT 7020 

AAGGGTAATT TTAAAATATC TGGGAAGTCC CTTCCACTGC TGTGTTCCAG AAGTGTTGGT 7080 

AAACAGCCCA CAAATGTCAA CAGCAGAAAC ATACAAGCTG TCAGCTTTGC ACAAGGGCCC 7140 

AACACCCTGC TCATCAAGAA GCACTGTGGT TGCTGTGTTA GTAATGTGCA AAACAGGAGG 7200 

CACATTTTCC CCACCTGTGT AGGTTCCAAA ATATCTAGTG TTTTCATTTT TACTTGGATC 7260 

AGGAACCCAG CACTCCACTG GATAAGCATT ATCCTTATCC AAAACAGCCT TGTGGTCAGT 7320 

GTTCATCTGC TGACTGTCAA CTGTAGCATT TTTTGGGGTT ACAGTTTGAG CAGGATATTT 7380

GGTCCTGTAG TTTGCTAACA CACCCTGCAG CTCCAAAGGT TCCCCACCAA CAGCAAAAAA 7440 

ATGAAAATTT GACCCTTGAA TGGGTTTTCC AGCACCATTT TCATGAGTTT TTTGTGTCCC 7500

TGAATGCAAG TTTAACATAG CAGTTACCCC AATAACCTCA GTTTTAACAG TAACAGCTTC 7560 

CCACATCAAA ATATTTCCAC AGGTTAAGTC CTCATTTAAA TTAGGCAAAG GAATTC 7616 
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AGATCTCCCG ATCCCCTATG GTCGACTCTC AGTAGAATCT GCTCTGATGC CGCATAGTTA 60

AGCCAGTATC TGCTCCCTGC TTGTGTGTTG GAGGTCGCTG AGTAGTGCGC GAGCAAAATT 120 

TAAGCTACAA CAAGGCAAGG CTTGACCGAC AATTGCATGA AGAATCTGCT TAGGGTTAGG 180 

CGTTTTGCGC TGCTTCGCGA TGTACGGGCC AGATATACGC GTTGACATTG ATTATTGACT 240 

AGTTATTAAT AGTAATCAAT TACGGGGTCA TTAGTTCATA GCCCATATAT GGAGTTCCGC 300 

GTTACATAAC TTACGGTAAA TGGCCCGCCT GGCTGACCGC CCAACGACCC CCGCCCATTG 360 

ACGTCAATAA TGACGTATGT TCCCATAGTA ACGCCAATAG GGACTTTCCA TTGACGTCAA 420 

TGGGTGGACT ATTTACGGTA AACTGCCCAC TTGGCAGTAC ATCAAGTGTA TCATATGCCA 480 

AGTACGCCCC CTATTGACGT CAATGACGGT AAATGGCCCG CCTGGCATTA TGCCCAGTAC 540 

ATGACCTTAT GGGACTTTCC TACTTGGCAG TACATCTACG TATTAGTCAT CGCTATTACC 600 

ATGGTGATGC GGTTTTGGCA GTACATCAAT GGGCGTGGAT AGCGGTTTGA CTCACGGGGA 660 

TTTCCAAGTC TCCACCCCAT TGACGTCAAT GGGAGTTTGT TTTGGCACCA AAATCAACGG 720 

GACTTTCCAA AATGTCGTAA CAACTCCGCC CCATTGACGC AAATGGGCGG TAGGCGTGTA 780 
CGGTGGGAGG TCTATATAAG CAGAGCTCTC TGGCTAACTA GAGAACCCAC TGCTTAACTG 840

GCTTATCGAA ATGTCGACTG AGAACTTCAG GGTGAGTTTG GGGACCCTTG ATTGTTCTTT 900 

CTTTTTCGCT ATTGTAAAAT TCATGTTATA TGGAGGGGGC AAAGTTTTCA GGGTGTTGTT 960 

TAGAATGGGA AGATGTCCCT TGTATCACCA TGGACCCTCA TGATAATTTT GTTTCTTTCA 1020 

CTTTCTACTC TGTTGACAAC CATTGTCTCC TCTTATTTTC TTTTCATTTT CTGTAACTTT 1080 

TTCGTTAAAC TTTAGCTTGC ATTTGTAACG AATTTTTAAA TTCACTTTTG TTTATTTGTC 1140 

AGATTGTAAG TACTTTCTCT AATCACTTTT TTTTCAAGGC AATCAGGGTA TATTATATTG 1200 

TACTTCAGCA CAGTTTTAGA GAACAATTGT TATAATTAAA TGATAAGGTA GAATATTTCT 1260

GCATATAAAT TCTGGCTGGC GTGGAAATAT TCTTATTGGT AGAAACAACT ACATCCTGGT 1320
CATCATCCTG CCTTTCTCTT TATGGTTACA ATGATATACA CTGTTTGAGA TGAGGATAAA 1380

ATACTCTGAG TCCAAACCGG GCCCCTCTGC TAACCATGTT CATGCCTTCT TCTTTTTCCT 1440 

ACAGCTCCTG GGCAACGTGC TGGTTGTTGT GCTGTCTCAT CATTTTGGCA AGAATTGGCC 1500 

GCAAGCTTCT GCAGCATCGT TCTGTGTTGT CTCTGTCTGA CTGTGTTTCT GTATTTGTCT 1560 

GAGAATATGG GCCAGACTGT TACCACTCCC TTAAGTTTGA CCTTAGGTCA CTGGAAAGAT 1620 

GTCGAGCGGA TCGCTCACAA CCAGTCGGTA GATGTCAAGA AGAGACGTTG GGTTACCTTC 1680 

TGCTCTGCAG AATGGCCAAC CTTTAACGTC GGATGGCCGC GAGACGGCAC CTTTAACCGA 1740 

GACCTCATCA CCCAGGTTAA GATCAAGGTC TTTTCACCTG GCCCGCATGG ACACCCAGAC 1800 

CAGGTCCCCT ACATCGTGAC CTGGGAAGCC TTGGCTTTTG ACCCCCCTCC CTGGGTCAAG 1860 

CCCTTTGTAC ACCCTAAGCC TCCGCCTCCT CTTCCTCCAT CCGCCCCGTC TCTCCCCCTT 1920 

GAACCTCCTC GTTCGACCCC GCCTCGATCC TCCCTTTATC CAGCCCTCAC TCCTTCTCTA 1980 

GGCGCCAAAC CTAAACCTCA AGTTCTTTCT GACAGTGGGG GGCCGCTCAT CGACCTACTT 2040

ACAGAAGACC CCCCGCCTTA TAGGGACCCA AGACCACCCC CTTCCGACAG GGACGGAAAT 2100 

GGTGGAGAAG CGACCCCTGC GGGAGAGGCA CCGGACCCCT CCCCAATGGC ATCTCGCCTA 2160 

CGTGGGAGAC GGGAGCCCCC TGTGGCCGAC TCCACTACCT CGCAGGCATT CCCCCTCCGC 2220

GCAGGAGGAA ACGGACAGCT TCAATACTGG CCGTTCTCCT CTTCTGACCT TTACAACTGG 2280 

AAAAATAATA ACCCTTCTTT TTCTGAAGAT CCAGGTAAAC TGACAGCTCT GATCGAGTCT 2340 

GTTCTCATCA CCCATCAGCC CACCTGGGAC GACTGTCAGC AGCTGTTGGG GACTCTGCTG 2400 

ACCGGAGAAG AAAAACAACG GGTGCTCTTA GAGGCTAGAA AGGCGGTGCG GGGCGATGAT 2460 

GGGCGCCCCA CTCAACTGCC CAATGAAGTC GATGCCGCTT TTCCCCTCGA GCGCCCAGAC 2520 

TGGGATTACA CCACCCAGGC AGGTAGGAAC CACCTAGTCC ACTATCGCCA GTTGCTCCTA 2580 

GCGGGTCTCC AAAACGCGGG CAGAAGCCCC ACCAATTTGG CCAAGGTAAA AGGAATAACA 2640

CAAGGGCCCA ATGAGTCTCC CTCGGCCTTC CTAGAGAGAC TTAAGGAAGC CTATCGCAGG 2700 

TACACTCCTT ATGACCCTGA GGACCCAGGG CAAGAAACTA ATGTGTCTAT GTCTTTCATT 2760 

TGGCAGTCTG CCCCAGACAT TGGGAGAAAG TTAGAGAGGT TAGAAGATTT AAAAAACAAG 2820 

ACGCTTGGAG ATTTGGTTAG AGAGGCAGAA AAGATCTTTA ATAAACGAGA AACCCCGGAA 2880

GAAAGAGAGG AACGTATCAG GAGAGAAACA GAGGAAAAAG AAGAACGCCG TAGGACAGAG 2940 

GATGAGCAGA AAGAGAAAGA AAGAGATCGT AGGAGACATA GAGAGATGAG CAAGCTATTG 3000 

GCCACTGTCG TTAGTGGACA GAAACAGGAT AGACAGGGAG GAGAACGAAG GAGGTCCCAA 3060 

CTCGATCGCG ACCAGTGTGC CTACTGCAAA GAAAAGGGGC ACTGGGCTAA AGATTGTCCC 3120 

AAGAAACCAC GAGGACCTCG GGGACCAAGA CCCCAGACCT CCCTCCTGAC CCTAGATGAC 3180 

TAGGGAGGTC AGGGTCAGGA GCCCCCCCCT GAACCCAGGA TAACCCTCAA AGTCGGGGGG 3240 

CAACCCGTCA CCTTCCTGGT AGATACTGGG GCCCAACACT CCGTGCTGAC CCAAAATCCT 3300 

GGACCCCTAA GTGATAAGTC TGCCTGGGTC CAAGGGGCTA CTGGAGGAAA GCGGTATCGC 3360

TGGACCACGG ATCGCAAAGT ACATCTAGCT ACCGGTAAGG TCACCCACTC TTTCCTCCAT 3420 

GTACCAGACT GTCCCTATCC TCTGTTAGGA AGAGATTTGC TGACTAAACT AAAAGCCCAA 3480 

ATCCACTTTG AGGGATCAGG AGCTCAGGTT ATGGGACCAA TGGGGCAGCC CCTGCAAGTG 3540

TTGACCCTAA ATATAGAAGA TGAGCATCGG CTACATGAGA CCTCAAAAGA GCCAGATGTT 3600 

TCTCTAGGGT CCACATGGCT GTCTGATTTT CCTCAGGCCT GGGCGGAAAC CGGGGGCATG 3660

GGACTGGCAG TTCGCCAAGC TCCTCTGATC ATACCTCTGA AAGCAACCTC TACCCCCGTG 3720 

TCCATAAAAC AATACCCCAT GTCACAAGAA GCCAGACTGG GGATCAAGCC CCACATACAG 3780 

AGACTGTTGG ACCAGGGAAT ACTGGTACCC TGCCAGTCCC CCTGGAACAC GCCCCTGCTA 3840 

CCCGTTAAGA AACCAGGGAC TAATGATTAT AGGCCTGTCC AGGATCTGAG AGAAGTCAAC 3900

AAGCGGGTGG AAGACATCCA CCCCACCGTG CCCAACCCTT ACAACCTCTT GAGCGGGCTC 3960

CCACCGTCCC ACCAGTGGTA CACTGTGCTT GATTTAAAGG ATGCCTTTTT CTGCCTGAGA 4020 

CTCCACCCCA CCAGTCAGCC TCTCTTCGCC TTTGAGTGGA GAGATCCAGA GATGGGAATC 4080 
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TCAGGACAAT TGACCTGGAC CAGACTCCCA CAGGGTTTCA AAAACAGTCC CACCCTGTTT 4140

GATGAGGCAC TGCACAGAGA CCTAGCAGAC TTCCGGATCC AGCACCCAGA CTTGATCCTG 4200

CTACAGTACG TGGATGACTT ACTGCTGGCC GCCACTTCTG AGCTAGACTG CCAACAAGGT 4260 

ACTCGGGCCC TGTTACAAAC CCTAGGGAAC CTCGGGTATC GGGCCTCGGC CAAGAAAGCC 4320 

CAAATTTGCC AGAAACAGGT CAAGTATCTG GGGTATCTTC TAAAAGAGGG TCAGAGATGG 4380

CTGACTGAGG CCAGAAAAGA GACTGTGATG GGGCAGCCTA CTCCGAAGAC CCCTCGACAA 4440 

CTAAGGGAGT TCCTAGGGAC GGCAGGCTTC TGTCGCCTCT GGATCCCTGG GTTTGCAGAA 4500 

ATGGCAGCCC CCTTGTACCC TCTCACCAAA ACGGGGACTC TGTTTAATTG GGGCCCAGAC 4560 

CAACAAAAGG CCTATCAAGA AATCAAGCAA GCTCTTCTAA CTGCCCCAGC CCTGGGGTTG 4620 

CCAGATTTGA CTAAGCCCTT TGAACTCTTT GTCGACGAGA AGCAGGGCTA CGCCAAAGGT 4680 

GTCCTAACGC AAAAACTGGG ACCTTGGCGT CGGCCGGTGG CCTACCTGTC CAAAAAGCTA 4740 

GACCCAGTAG CAGCTGGGTG GCCCCCTTGC CTACGGATGG TAGCAGCCAT TGCCGTACTG 4800 

ACAAAGGATG CAGGCAAGCT AACCATGGGA CAGCCACTAG TCATTCTGGC CCCCCATGCA 4860 

GTAGAGGCAC TAGTCAAACA ACCCCCCGAC CGCTGGCTTT CCAACGCCCG GATGACTCAC 4920

TATCAGGCCT TGCTTTTGGA CACGGACCGG GTCCAGTTCG GACCGGTGGT AGCCCTGAAC 4980 

CCGGCTACGC TGCTCCCACT GCCTGAGGAA GGGCTGCAAC ACAACTGCCT TGATATCCTG 5040 

GCCGAAGCCC ACGGAACCCG ACCCGACCTA ACGGACCAGC CGCTCCCAGA CGCCGACCAC 5100 

ACCTGGTACA CGGATGGAAG CAGTCTCTTA CAAGAGGGAC AGCGTAAGGC GGGAGCTGCG 5160 
GTGACCACCG AGACCGAGGT AATCTGGGCT AAAGCCCTGC CAGCCGGGAC ATCCGCTCAG 5220

CGGGCTGAAC TGATAGCACT CACCCAGGCC CTAAAGATGG CAGAAGGTAA GAAGCTAAAT 5280 

GTTTATACTG ATAGCCGTTA TGCTTTTGCT ACTGCCCATA TCCATGGAGA AATATACAGA 5340 

AGGCGTGGGT TGCTCACATC AGAAGGCAAA GAGATCAAAA ATAAAGACGA GATCTTGGCC 5400 

CTACTAAAAG CCCTCTTTCT GCCCAAAAGA CTTAGCATAA TCCATTGTCC AGGACATCAA 5460

AAGGGACACA GCGCCGAGGC TAGAGGCAAC CGGATGGCTG ACCAAGCGGC CCGAAAGGCA 5520

GCCATCACAG AGACTCCAGA CACCTCTACC CTCCTCATAG AAAATTCATC ACCCTACACC 5580

TCAGAACATT TTCATTACAC AGTGACTGAT ATAAAGGACC TAACCAAGTT GGGGGCCATT 5640 

TATGATAAAA CAAAGAAGTA TTGGGTCTAC CAAGGAAAAC CTGTGATGCC TGACCAGTTT 5700

ACTTTTGAAT TATTAGACTT TCTTCATCAG CTGACTCACC TCAGCTTCTC AAAAATGAAG 5760

GCTCTCCTAG AGAGAAGCCA CAGTCCCTAC TACATGCTGA ACCGGGATCG AACACTCAAA 5820

AATATCACTG AGACCTGCAA AGCTTGTGCA CAAGTCAACG CCAGCAAGTC TGCCGTTAAA 5880 

CAGGGAACTA GGGTCCGCGG GCATCGGCCC GGCACTCATT GGGAGATCGA TTTCACCGAG 5940 

ATAAAGCCCG GATTGTATGG CTATAAATAT CTTCTAGTTT TTATAGATAC CTTTTCTGGC 6000 
TGGATAGAAG CCTTCCCAAC CAAGAAAGAA ACCGCCAAGG TCGTAACCAA GAAGCTACTA 6060

GAGGAGATCT TCCCCAGGTT CGGCATGCCT CAGGTATTGG GAACTGACAA TGGGCCTGCC 6120

TTCGTCTCCA AGGTGAGTCA GACAGTGGCC GATCTGTTGG GGATTGATTG GAAATTACAT 6180 
TGTGCATACA GACCCCAAAG CTCAGGCCAG GTAGAAAGAA TGAATAGAAC CATCAAGGAG 6240

ACTTTAACTA AATTAACGCT TGCAACTGGC TCTAGAGACT GGGTGCTCCT ACTCCCCTTA 6300

GCCCTGTACC GAGCCCGCAA CACGCCGGGC CCCCATGGCC TCACCCCATA TGAGATCTTA 6360

TATGGGGCAC CCCCGCCCCT TGTAAACTTC CCTGACCCTG ACATGACAAG AGTTACTAAC 6420

AGCCCCTCTC TCCAAGCTCA CTTACAGGCT CTCTACTTAG TCCAGCACGA AGTCTGGAGA 6480 

CCTCTGGCGG CAGCCTACCA AGAACAACTG GACCGACCGG TGGTACCTCA CCCTTACCGA 6540 

GTCGGCGACA CAGTGTGGGT CCGCCGACAC CAGACTAAGA ACCTAGAACC TCGCTGGAAA 6600

GGACCTTACA CAGTCCTGCT GACCACCCCC ACCGCCCTCA AAGTAGACGG CATCGCAGCT 6660 

TGGATACACG CCGCCCACGT GAAGGCTGCC GACCCCGGGG GTGGACCATC CTCTAGACTG 6720 
ACATGGCGCG TTCAACGCTC TCAAAACCCC TTAAAAATAA GGTTAACCCG CGAGGCCCCC 6780

TAATCCCCTT AATTCTTCTG ATGCTCAGAG GGGTCAGTAC TGCTTCGCCC GGCTCCAGTG 6840 
CGGCCCAGCC GGCCACCATG AAAACATTTA ACATTTCTCA ACAAGATCTA GAATTAGTAG 6900

AAGTAGCGAC AGAGAAGATT ACAATGCTTT ATGAGGATAA TAAACATCAT GTGGGAGCGG 6960

CAATTCGTAC GAAAACAGGA GAAATCATTT CGGCAGTACA TATTGAAGCG TATATAGGAC 7020 

GAGTAACTGT TTGTGCAGAA GCCATTGCGA TTGGTAGTGC AGTTTCGAAT GGACAAAAGG 7080 

ATTTTGACAC GATTGTAGCT GTTAGACACC CTTATTCTGA CGAAGTAGAT AGAAGTATTC 7140 

GAGTGGTAAG TCCTTGTGGT ATGTGTAGGG AGTTGATTTC AGACTATGCA CCAGATTGTT 7200 

TTGTGTTAAT AGAAATGAAT GGCAAGTTAG TCAAAACTAC GATTGAAGAA CTCATTCCAC 7260

TCAAATATAC CCGAAATTAA AAGTTTTACC ACCAAGCTTA TCGAATTC 7308 
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AGATCTCCCG ATCCCCTATG GTCGACTCTC AGTACAATCT GCTCTGATGC CGCATAGTTA 60

AGCCAGTATC TGCTCCCTGC TTGTGTGTTG GAGGTCGCTG AGTAGTGCGC GAGCAAAATT 120 

TAAGCTACAA CAAGGCAAGG CTTGACCGAC AATTGCATGA AGAATCTGCT TAGGGTTAGG 180 

CGTTTTGCGC TGCTTCGCGA TGTACGGGCC AGATATACGC GTTGACATTG ATTATTGACT 240 

AGTTATTAAT AGTAATCAAT TACGGGGTCA TTAGTTCATA GCCCATATAT GGAGTTCCGC 300

GTTACATAAC TTACGGTAAA TGGCCCGCCT GGCTGACCGC CCAACGACCC CCGCCCATTG 360 

ACGTCAATAA TGACGTATGT TCCCATAGTA ACGCCAATAG GGACTTTCCA TTGACGTCAA 420 

TGGGTGGACT ATTTACGGTA AACTGCCCAC TTGGCAGTAC ATCAAGTGTA TCATATGCCA 480 

AGTACGCCCC CTATTGACGT CAATGACGGT AAATGGCCCG CCTGGCATTA TGCCCAGTAC 540 

ATGACCTTAT GGGACTTTCC TACTTGGCAG TACATCTACG TATTAGTCAT CGCTATTACC 600 

ATGGTGATGC GGTTTTGGCA GTACATCAAT GGGCGTGGAT AGCGGTTTGA CTCACGGGGA 660

TTTCCAAGTC TCCACCCCAT TGACGTCAAT GGGAGTTTGT TTTGGCACCA AAATCAACGG 720 

GACTTTCCAA AATGTCGTAA CAACTCCGCC CCATTGACGC AAATGGGCGG TAGGCGTGTA 780 

CGGTGGGAGG TCTATATAAG CAGAGCTCTC TGGCTAACTA GAGAACCCAC TGCTTAACTG 840 

GCTTATCGAA ATGTCGACTG AGAACTTCAG GGTGAGTTTG GGGACCCTTG ATTGTTCTTT 900 

CTTTTTCGCT ATTGTAAAAT TCATGTTATA TGGAGGGGGC AAAGTTTTCA GGGTGTTGTT 960

TAGAATGGGA AGATGTCCCT TGTATCACCA TGGACCCTCA TGATAATTTT GTTTCTTTCA 1020

CTTTCTACTC TGTTGACAAC CATTGTCTCC TCTTATTTTC TTTTCATTTT CTGTAACTTT 1080 

TTCGTTAAAC TTTAGCTTGC ATTTGTAACG AATTTTTAAA TTCACTTTTG TTTATTTGTC 1140 

AGATTGTAAG TACTTTCTCT AATCACTTTT TTTTCAAGGC AATCAGGGTA TATTATATTG 1200 

TACTTCAGCA CAGTTTTAGA GAACAATTGT TATAATTAAA TGATAAGGTA GAATATTTCT 1260

GCATATAAAT TCTGGCTGGC GTGGAAATAT TCTTATTGGT AGAAACAACT ACATCCTGGT 1320 

CATCATCCTG CCTTTCTCTT TATGGTTACA ATGATATACA CTGTTTGAGA TGAGGATAAA 1380 

ATACTCTGAG TCCAAACCGG GCCCCTCTGC TAACCATGTT CATGCCTTCT TCTTTTTCCT 1440 

ACAGCTCCTG GGCAACGTGC TGGTTGTTGT GCTGTCTCAT CATTTTGGCA AGAATTGGCC 1500 

GCAAGCTTCT GCAGCATCGT TCTGTGTTGT CTCTGTCTGA CTGTGTTTCT GTATTTGTCT 1560
GAGAATATGG GCCAGACTGT TACCACTCCC TTAAGTTTGA CCTTAGGTCA CTGGAAAGAT 1620

GTCGAGCGGA TCGCTCACAA CCAGTCGGTA GATGTCAAGA AGAGACGTTG GGTTACCTTC 1680

TGCTCTGCAG AATGGCCAAC CTTTAACGTC GGATGGCCGC GAGACGGCAC CTTTAACCGA 1740 
GACCTCATCA CCCAGGTTAA GATCAAGGTC TTTTCACCTG GCCCGCATGG ACACCCAGAC 1800 

CAGGTCCCCT ACATCGTGAC CTGGGAAGCC TTGGCTTTTG ACCCCCCTCC CTGGGTCAAG 1860 
CCCTTTGTAC ACCCTAAGCC TCCGCCTCCT CTTCCTCCAT CCGCCCCGTC TCTCCCCCTT 1920

GAACCTCCTC GTTCGACCCC GCCTCGATCC TCCCTTTATC CAGCCCTCAC TCCTTCTCTA 1980 

GGCGCCAAAC CTAAACCTCA AGTTCTTTCT GACAGTGGGG GGCCGCTCAT CGACCTACTT 2040 

ACAGAAGACC CCCCGCCTTA TAGGGACCCA AGACCACCCC CTTCCGACAG GGACGGAAAT 2100

GGTGGAGAAG CGACCCCTGC GGGAGAGGCA CCGGACCCCT CCCCAATGGC ATCTCGCCTA 2160

CGTGGGAGAC GGGAGCCCCC TGTGGCCGAC TCCACTACCT CGCAGGCATT CCCCCTCCGC 2220 

GCAGGAGGAA ACGGACAGCT TCAATACTGG CCGTTCTCCT CTTCTGACCT TTACAACTGG 2280 

AAAAATAATA ACCCTTCTTT TTCTGAAGAT CCAGGTAAAC TGACAGCTCT GATCGAGTCT 2340 

GTTCTCATCA CCCATCAGCC CACCTGGGAC GACTGTCAGC AGCTGTTGGG GACTCTGCTG 2400 
ACCGGAGAAG AAAAACAACG GGTGCTCTTA GAGGCTAGAA AGGCGGTGCG GGGCGATGAT 2460

GGGCGCCCCA CTCAACTGCC CAATGAAGTC GATGCCGCTT TTCCCCTCGA GCGCCCAGAC 2520 

TGGGATTACA CCACCCAGGC AGGACGCAAC CACCTAGTCC ACTATCGCCA GTTGCTCCTA 2580 

GCGGGTCTCC AAAACGCGGG CAGAAGCCCC ACCAATTTGG CCAAGGTAAA AGGAATAACA 2640 

CAAGGGCCCA ATGAGTCTCC CTCGGCCTTC CTAGAGAGAC TTAAGGAAGC CTATCGCAGG 2700 

TACACTCCTT ATGACCCTGA GGACCCAGGG CAAGAAACTA ATGTGTCTAT GTCTTTCATT 2760

TGGCAGTCTG CCCCAGACAT TGGGAGAAAG TTAGAGAGGT TAGAAGATTT AAAAAACAAG 2820 

ACGCTTGGAG ATTTGGTTAG AGAGGCAGAA AAGATCTTTA ATAAACGAGA AACCCCGGAA 2880 

GAAAGAGAGG AACGTATCAG GAGAGAAACA GAGGAAAAAG AAGAACGCCG TAGGACAGAG 2940

GATGAGCAGA AAGAGAAAGA AAGAGATCGT AGGAGACATA GAGAGATGAG CAAGCTATTG 3000

GCCACTGTCG TTAGTGGACA GAAACAGGAT AGACAGGGAG GAGAACGAAG GAGGTCCCAA 3060 

CTCGATCGCG ACCAGTGTGC CTACTGCAAA GAAAAGGGGC ACTGGGCTAA AGATTGTCCC 3120 

AAGAAACCAC GAGGACCTCG GGGACCAAGA CCCCAGACCT CCCTCCTGAC CCTAGATGAC 3180 

TAGGGAGGTC AGGGTCAGGA GCCCCCCCCT GAACCCAGGA TAACCCTCAA AGTCGGGGGG 3240 

CAACCCGTCA CCTTCCTGGT AGATACTGGG GCCCAACACT CCGTGCTGAC CCAAAATCCT 3300 

GGACCCCTAA GTGATAAGTC TGCCTGGGTC CAAGGGGCTA CTGGAGGAAA GCGGTATCGC 3360

TGGACCACGG ATCGCAAAGT ACATCTAGCT ACCGGTAAGG TCACCCACTC TTTCCTCCAT 3420 

GTACCAGACT GTCCCTATCC TCTGTTAGGA AGAGATTTGC TGACTAAACT AAAAGCCCAA 3480 

ATCCACTTTG AGGGATCAGG AGCTCAGGTT ATGGGACCAA TGGGGCAGCC CCTGCAAGTG 3540 

TTGACCCTAA ATATAGAAGA TGAGCATCGG CTACATGAGA CCTCAAAAGA GCCAGATGTT 3600

TCTCTAGGGT CCACATGGCT GTCTGATTTT CCTCAGGCCT GGGCGGAAAC CGGGGGCATG 3660

GGACTGGCAG TTCGCCAAGC TCCTCTGATC ATACCTCTGA AAGCAACCTC TACCCCCGTG 3720 

TCCATAAAAC AATACCCCAT GTCACAAGAA GCCAGACTGG GGATCAAGCC CCACATACAG 3780 

AGACTGTTGG ACCAGGGAAT ACTGGTACCC TGCCAGTCCC CCTGGAACAC GCCCCTGCTA 3840

CCCGTTAAGA AACCAGGGAC TAATGATTAT AGGCCTGTCC AGGATCTGAG AGAAGTCAAC 3900

AAGCGGGTGG AAGACATCCA CCCCACCGTG CCCAACCCTT ACAACCTCTT GAGCGGGCTC 3960 

CCACCGTCCC ACCAGTGGTA CACTGTGCTT GATTTAAAGG ATGCCTTTTT CTGCCTGAGA 4020 

CTCCACCCCA CCAGTCAGCC TCTCTTCGCC TTTGAGTGGA GAGATCCAGA GATGGGAATC 4080 
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TCAGGACAAT TGACCTGGAC CAGACTCCCA CAGGGTTTCA AAAACAGTCC CACCCTGTTT 4140

GATGAGGCAC TGCACAGAGA CCTAGCAGAC TTCCGGATCC AGCACCCAGA CTTGATCCTG 4200

CTACAGTACG TGGATGACTT ACTGCTGGCC GCCACTTCTG AGCTAGACTG CCAACAAGGT 4260

ACTCGGGCCC TGTTACAAAC CCTAGGGAAC CTCGGGTATC GGGCCTCGGC CAAGAAAGCC 4320

CAAATTTGCC AGAAACAGGT CAAGTATCTG GGGTATCTTC TAAAAGAGGG TCAGAGATGG 4380

CTGACTGAGG CCAGAAAAGA GACTGTGATG GGGCAGCCTA CTCCGAAGAC CCCTCGACAA 4440

CTAAGGGAGT TCCTAGGGAC GGCAGGCTTC TGTCGCCTCT GGATCCCTGG GTTTGCAGAA 4500

ATGGCAGCCC CCTTGTACCC TCTCACCAAA ACGGGGACTC TGTTTAATTG GGGCCCAGAC 4560

CAACAAAAGG CCTATCAAGA AATCAAGCAA GCTCTTCTAA CTGCCCCAGC CCTGGGGTTG 4620

CCAGATTTGA CTAAGCCCTT TGAACTCTTT GTCGACGAGA AGCAGGGCTA CGCCAAAGGT 4680

GTCCTAACGC AAAAACTGGG ACCTTGGCGT CGGCCGGTGG CCTACCTGTC CAAAAAGCTA 4740

GACCCAGTAG CAGCTGGGTG GCCCCCTTGC CTACGGATGG TAGCAGCCAT TGCCGTACTG 4800

ACAAAGGATG CAGGCAAGCT AACCATGGGA CAGCCACTAG TCATTCTGGC GCCCCATGCA 4860

GTAGAGGCAC TAGTCAAACA ACCCCCCGAC CGCTGGCTTT CCAACGCCCG GATGACTCAC 4920

TATCAGGCCT TGCTTTTGGA CACGGACCGG GTCCAGTTCG GACCGGTGGT AGCCCTGAAC 4980

CCGGCTACGC TGCTCCCACT GCCTGAGGAA GGGCTGCAAC ACAACTGCCT TGATATCCTG 5040

GCCGAAGCCC ACGGAACCCG ACCCGACCTA ACGGACCAGC CGCTCCCAGA CGCCGACCAC 5100

ACCTGGTACA CGGATGGAAG CAGTCTCTTA CAAGAGGGAC AGCGTAAGGC GGGAGCTGCG 5160

GTGACCACCG AGACCGAGGT AATCTGGGCT AAAGCCCTGC CAGCCGGGAC ATCCGCTCAC 5220

CGGGCTGAAC TGATAGCACT CACCCAGGCC CTAAAGATGG CAGAAGGTAA GAAGCTAAAT 5280

GTTTATACTG ATAGCCGTTA TGCTTTTGCT ACTGCCCATA TCCATGGAGA AATATACAGA 5340

AGGCGTGGGT TGCTCACATC AGAAGGCAAA GAGATCAAAA ATAAAGACGA GATCTTGGCC 5400

CTACTAAAAG CCCTCTTTCT GCCCAAAAGA CTTAGCATAA TCCATTGTCC AGGACATCAA 5460

AAGGGACACA GCGCCGAGGC TAGAGGCAAC CGGATGGCTG ACCAAGCGGC CCGAAAGGCA 5520

GCCATCACAG AGACTCCAGA CACCTCTACC CTCCTCATAG AAAATTCATC ACCCTACACC 5580

TCAGAACATT TTCATTACAC AGTGACTGAT ATAAAGGACC TAACCAAGTT GGGGGCCATT 5640

TATGATAAAA GAAAGAAGTA TTGGGTCTAC CAAGGAAAAC CTGTGATGCC TGACCAGTTT 5700

ACTTTTGAAT TATTAGACTT TCTTCATCAG CTGACTCACC TCAGCTTCTC AAAAATGAAG 5760

GCTCTCCTAG AGAGAAGCCA CAGTCCCTAC TACATGCTGA ACCGGGATCG AACACTCAAA 5820

AATATCACTG AGACCTGCAA AGCTTGTGCA CAAGTCAACG CCAGCAAGTC TGCCGTTAAA 5880

CAGGGAACTA GGGTCCGCGG GCATCGGCCC GGCACTCATT GGGAGATCGA TTTCACCGAG 5940

ATAAAGCCCG GATTGTATGG CTATAAATAT CTTCTAGTTT TTATAGATAC CTTTTCTGGC 6000

TGGATAGAAG CCTTCCCAAC CAAGAAAGAA ACCGCCAAGG TCGTAACCAA GAAGCTACTA 6060

GAGGAGATCT TCCCCAGGTT CGGCATGCCT CAGGTATTGG GAACTGACAA TGGGCCTGCC 6120

TTCGTCTCCA AGGTGAGTCA GACAGTGGCC GATCTGTTGG GGATTGATTG GAAATTACAT 6180

TGTGCATACA GACCCCAAAG CTCAGGCCAG GTAGAAAGAA TGAATAGAAC CATCAAGGAG 6240

ACTTTAACTA AATTAACGCT TGCAACTGGC TCTAGAGACT GGGTGCTCCT ACTCCCCTTA 6300

GCCCTGTACC GAGCCCGCAA CACGCCGGGC CCCCATGGCC TCACCCCATA TGAGATCTTA 6360

TATGGGGCAC CCCCGCCCCT TGTAAACTTC CCTGACCCTG ACATGACAAG AGTTACTAAC 6420

AGCCCCTCTC TCCAAGCTCA CTTACAGGCT CTCTACTTAG TCCAGCACGA AGTCTGGAGA 6480

CCTCTGGCGG CAGCCTACCA AGAACAACTG GACCGACCGG TGGTACCTCA CCCTTACCGA 6540

GTCGGCGACA CAGTGTGGGT CCGCCGACAC CAGACTAAGA ACCTAGAACC TCGCTGGAAA 6600

GGACCTTACA CAGTCCTGCT GACCACCCCC ACCGCCCTCA AAGTAGACGG CATCGCAGCT 6660

TGGATACACG CCGCCCACGT GAAGGCTGCC GACCCCGGGG GTGGACCATC CTCTAGACTG 6720

ACATGGCGCG TTCAACGCTC TCAAAACCCC TTAAAAATAA GGTTAACCCG CGAGGCCCCC 6780

TAATCCCCTT AATTCTTCTG ATGCTCAGAG GGGTCAGTAC TGCTTCGCCC GGCTCCAGTG 6840

CGGCCCAGCC GGCCACCATG AAAACATTTA ACATTTCTCA ACAAGATCTA GAATTAGTAG 6900

AAGTAGCGAC AGAGAAGATT ACAATGCTTT ATGAGGATAA TAAACATCAT GTGGGAGCGG 6960

CAATTCGTAC GAAAACAGGA GAAATCATTT CGGCAGTACA TATTGAAGCG TATATAGGAC 7020

GAGTAACTGT TTGTGCAGAA GCCATTGCGA TTGGTAGTGC AGTTTCGAAT GGACAAAAGG 7080

ATTTTGACAC GATTGTAGCT GTTAGACACC CTTATTCTGA CGAAGTAGAT AGAAGTATTC 7140

GAGTGGTAAG TCCTTGTGGT ATGTGTAGGG AGTTGATTTC AGACTATGCA CCAGATTGTT 7200

TTGTGTTAAT AGAAATGAAT GGCAAGTTAG TCAAAACTAC GATTGAAGAA CTCATTCCAC 7260

TCAAATATAC CCGAAATTAA AAGTTTTACC ACCAAGCTTA TCGAATTC 7308
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CATATGCGGT GTGAAATACC GCACAGATGC GTAAGGAGAA AATACCGCAT CAGGCGCCAT 60

TCGCCATTCA GGCTGCGCAA CTGTTGGGAA GGGCGATCGG TGCGGGCCTC TTCGCTATTA 120 

CGCCAGCTGG CGAAAGGGGG ATGTGCTGCA AGGCGATTAA GTTGGGTAAC GCCAGGGTTT 180 

TCCCAGTCAC GACGTTGTAA AACGACGGCC AGTGAATTCC GATTAGTTCA ATTTGTTAAA 240 

GACAGGATCT CAGTAGTCCA GGCTTTAGTC CTGACTCAAC AATACCACCA GCTAAAACCA 300 

CTAGAATACG AGCCACAATA AATAAAAGAT TTTATTTAGT TTCCAGAAAA AGGGGGGAAT 360

GAAAGACCCC ACCAAATTGC TTAGCCTGAT AGCCGCAGTA ACGCCATTTT GCAAGGCATG 420 

GAAAAATACC AAACCAAGAA TAGAGAAGTT CAGATCAAGG GCGGGTACAC GAAAACAGCT 480 

AACGTTGGGC CAAACAGGAT ATCTGCGGTG AGCAGTTTCG GCCCCGGCCC GGGGCCAAGA 540

ACAGATGGTC ACCGCGGTTC GGCCCCGGCC CGGGGCCAAG AACAGATGGT CCCCAGATAT 600 

GGCCCAACCC TCAGCAGTTT CTTAAGACCC ATCAGATGTT TCCAGGCTCC CCCAAGGACC 660 

TGAAATGACC CTGTGCCTTA TTTGAATTAA CCAATCAGCC TGCTTCTCGC TTCTGTTCGC 720 

GCGCTTCTGC TTCCCGAGCT CTATAAAAGA GCTCACAACC CCTCACTCGG CGCGCCAGTC 780 

CTCCGATAGA CTGAGTCGCC CGGGTACCCG TGTATCCAAT AAATCCTCTT GCTGTTGCAT 840 

CCGACTCGTG GTCTCGCTGT TCCTTGGGAG GGTCTCCTCA GAGTGATTGA CTACCCGTCT 900 

CGGGGGTCTT TCATTTGGGG GCTCGTCCGG GATCTGGAGA CCCCTGCCCA GGGACCACCG 960 

ACCCACCACC GGGAGGTAAG CTGGCCAAGA TCTTATATGG GGCACCCCCG CCCCTTGTAA 1020 

ACTTCCCTGA CCCTGACATG ACCAGAGTTA CTAACAGCCC CTCTCTCCAA GCTCACTTAC 1080 

AGGCTCTCTA CTTAGTCCAG CACGAAGTTT GGAGACCACT GGCGGCAGCT TACCAAGAAC 1140 

AACTGGACCG GCCGGTGGTG CCTCACCCTT ACCGGGTCGG CGACACAGTG TGGGTCCGCC 1200 

GACATCAAAC CAAGAACCTA GAACCTCGCT GGAAAGGACC TTAGACAGTC CTGCTGACCA 1260

CCCCCACCGC CCTCAAAGTA GACGGTATCG CAGCTTGGAT ACACGCAGCC CACGTAAAGG 1320 

CGGCCGACAC CGAGAGTGGA CCATCCTCTG GACGGACATG GCGCGTTCAA CGCTCTCAAA 1380 

ACCCCCTCAA GATAAGATTA ACCCGTGGAA GCCCTTAATA GTCATGGGAG TCCTGTTAGG 1440 

AGTAGGGATG GCAGAGAGCC CCCATCAGGT CTTTAATGTA ACCTGGAGAG TCACCAACCT 1500 

GATGACTGGG CGTACCGCCA ATGCCACCTC CCTCCTGGGA ACTGTACAAG ATGCCTTCCC 1560 

AAAATTATAT TTTGATCTAT GTGATCTGGT CGGAGAGGAG TGGGACCCTT CAGACCAGGA 1620
ACCGTATGTC GGGTATGGCT GCAAGTACCC CGCAGGGAGA CAGCGGACCC GGACTTTTGA 1680

CTTTTACGTG TGCCCTGGGC ATACCGTAAA GTCGGGGTGT GGGGGACCAG GAGAGGGCTA 1740 

CTGTGGTAAA TGGGGGTGTG AAACCACCGG ACAGGCTTAC TGGAAGCCCA CATCATCGTG 1800

GGACCTAATC TCCCTTAAGC GCGGTAACAC CCCCTGGGAC ACGGGATGCT CTAAAGTTGC 1860 

CTGTGGCCCC TGCTACGACC TCTCCAAAGT ATCCAATTCC TTCCAAGGGG CTACTCGAGG 1920 

GGGCAGATGC AACCCTCTAG TCCTAGAATT CACTGATGCA GGAAAAAAGG CTAACTGGGA 1980 
CGGGCCCAAA TCGTGGGGAC TGAGACTGTA CCGGACAGGA ACAGATCCTA TTACCATGTT 2040

CTCCCTGACC CGGCAGGTCC TTAATGTGGG ACCCCGAGTC CCCATAGGGC CCAACCCAGT 2100 

ATTACCCGAC CAAAGACTCC CTTCCTCACC AATAGAGATT GTACCGGCTC CACAGCCACC 2160 

TAGCCCCCTC AATACCAGTT ACCCCCCTTC CACTACCAGT ACACCCTCAA CCTCCCCTAC 2220 

AAGTCCAAGT GTCCCACAGC CACCCCCAGG AACTGGAGAT AGACTACTAG CTCTAGTCAA 2280 

AGGAGCCTAT CAGGCGCTTA ACCTCACCAA TCCCGACAAG ACCCAAGAAT GTTGGCTGTG 2340

CTTAGTGTCG GGACCTCCTT ATTACGAAGG AGTAGCGGTC GTGGGCACTT ATACCAATCA 2400 

TTCCACCGCT CCGGCCAACT GTACGGCCAC TTCCCAACAT AAGCTTACCC TATCTGAAGT 2460 

GACAGGACAG GGCCTATGCA TGGGGGCAGT ACCTAAAACT CACCAGGCCT TATGTAACAC 2520 

CACCCAAAGC GCCGGCTCAG GATCCTACTA CCTTGCAGCA CCCGCCGGAA CAATGTGGGC 2580 

TTGCAGCACT GGATTGACTC CCTGCTTGTC CACCACGGTG CTCAATCTAA CCACAGATTA 2640 

TTGTGTATTA GTTGAACTCT GGCCCAGAGT AATTTACCAC TCCCCCGATT ATATGTATGG 2700 

TCAGCTTGAA CAGCGTACCA AATATAAAAG AGAGCCAGTA TCATTGACCC TGGCCCTTCT 2760 

ACTAGGAGGA TTAACCATGG GAGGGATTGC AGCTGGAATA GGGACGGGGA CCACTGCCTT 2820 

AATTAAAACC CAGCAGTTTG AGCAGCTTCA TGCCGCTATC CAGACAGACC TCAACGAAGT 2880 

CGAAAAGTCA ATTACCAACC TAGAAAAGTC ACTGACCTCG TTGTCTGAAG TAGTCCTACA 2940

GAACCGCAGA GGCCTAGATT TGCTATTCCT AAAGGAGGGA GGTCTCTGCG CAGCCCTAAA 3000 

AGAAGAATGT TGTTTTTATG CAGACCACAC GGGGCTAGTG AGAGACAGCA TGGCCAAATT 3060 

AAGAGAAAGG CTTAATCAGA GACAAAAACT ATTTGAGACA GGCCAAGGAT GGTTCGAAGG 3120 

GCTGTTTAAT AGATCCCCCT GGTTTACCAC CTTAATCTCC ACCATCATGG GACCTCTAAT 3180

AGTACTCTTA CTGATCTTAC TCTTTGGACC TTGCATTCTC AATCGATTAG TTCAATTTGT 3240 

TAAAGACAGG ATCTCAGTAG TCCAGGCTTT AGTCCTGACT CAACAATACC ACCAGCTAAA 3300 

GCCTATAGAG TACGAGCCAT AGGGCGCCTA GTGTTGACAA TTAATCATCG GCATAGTATA 3360

CGGCATAGTA TAATACGACT CACTATAGGA GGGCCACCAT GGCCAAGTTG ACCAGTGCCG 3420 

TTCCGGTGCT CACCGCGCGC GACGTCGCCG GAGCGGTCGA GTTCTGGACC GACCGGCTCG 3480 

GGTTCTCCCG GGACTTCGTG GAGGACGACT TCGCCGGTGT GGTCCGGGAC GACGTGACCC 3540 

TGTTCATCAG CGCGGTCCAG GACCAGGTGG TGCCGGACAA CACCCTGGCC TGGGTGTGGG 3600

TGCGCGGCCT GGACGAGCTG TACGCCGAGT GGTCGGAGGT CGTGTCCACG AACTTCCGGG 3660

ACGCCTCCGG GCCGGCCATG ACCGAGATCG GCGAGCAGCC GTGGGGGCGG GAGTTCGCCC 3720 

TGCGCGACCC GGCCGGCAAC TGCGTGCACT TCGTGGCCGA GGAGCAGGAC TGANNNNCGG 3780 

ACCGGTCGAC TTGTTAACTT GTTTATTGCA GCTTATAATG GTTACAAATA AAGCAATAGC 3840

ATCACAAATT TCACAAATAA AGCATTTTTT TCACTGCATT CTAGTTGTGG TTTGTCCAAA 3900 

CTCATCAATG TATCTTATCA TGTCTGGATC CAGATCTGGG CCCATGCGGC CGCGGATCGA 3960

TNNNNACATG TGAGCAAAAG GCCAGCAAAA GGCCAGGAAC CGTAAAAAGG CCGCGTTGCT 4020 

GGCGTTTTTC CATAGGCTCC GCCCCCCTGA CGAGCATCAC AAAAATCGAC GCTCAAGTCA 4080 
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GAGGTGGCGA AACCCGACAG GACTATAAAG ATACCAGGCG TTTCCCCCTG GAAGCTCCCT 4140 

CGTGCGCTCT CCTGTTCCGA CCCTGCCGCT TACCGGATAC CTGTCCGCCT TTCTCCCTTC 4200

GGGAAGCGTG GCGCTTTCTC AATGCTCACG CTGTAGGTAT CTCAGTTCGG TGTAGGTCGT 4260 

TCGCTCCAAG CTGGGCTGTG TGCACGAACC CCCCGTTCAG CCCGACCGCT GCGCCTTATC 4320 

CGGTAACTAT CGTCTTGAGT CCAACCCGGT AAGACACGAC TTATCGCCAC TGGCAGCAGC 4380

CACTGGTAAC AGGATTAGCA GAGCGAGGTA TGTAGGCGGT GCTACAGAGT TCTTGAAGTG 4440 

GTGGCCTAAC TACGGCTACA CTAGAAGGAC AGTATTTGGT ATCTGCGCTC TGCTGAAGCC 4500 

AGTTACCTTC GGAAAAAGAG TTGGTAGCTC TTGATCCGGC AAACAAACCA CCGCTGGTAG 4560 

CGGTGGTTTT TTTGTTTGCA AGCAGCAGAT TACGCGCAGA AAAAAAGGAT CTCAAGAAGA 4620 

TCCTTTGATC TTTTCTACGG GGTCTGACGC TCAGTGGAAC GAAAACTCAC GTTAAGGGAT 4680 

TTTGGTCATG AGATTATCAA AAAGGATCTT CACCTAGATC CTTTTAAATT AAAAATGAAG 4740 

TTTTAAATCA ATCTAAAGTA TATATGAGTA AACTTGGTCT GACAGTTACC AATGCTTAAT 4800 

CAGTGAGGCA CCTATCTCAG CGATCTGTCT ATTTCGTTCA TCCATAGTTG CCTGACTCCC 4860

CGTCGTGTAG ATAACTACGA TACGGGAGGG CTTACCATCT GGCCCCAGTG CTGCAATGAT 4920 

ACCGCGAGAC CCACGCTCAC CGGCTCCAGA TTTATCAGCA ATAAACCAGC CAGCCGGAAG 4980 

GGCCGAGCGC AGAAGTGGTC CTGCAACTTT ATCCGCCTCC ATCCAGTCTA TTAATTGTTG 5040 

CCGGGAAGCT AGAGTAAGTA GTTCGCCAGT TAATAGTTTG CGCAACGTTG TTGCCATTGC 5100 

TACAGGCATC GTGGTGTCAC GCTCGTCGTT TGGTATGGCT TCATTCAGCT CCGGTTCCCA 5160 

ACGATCAAGG CGAGTTACAT GATCCCCCAT GTTGTGCAAA AAAGCGGTTA GCTCCTTCGG 5220 

TCCTCCGATC GTTGTCAGAA GTAAGTTGGC CGCAGTGTTA TCACTCATGG TTATGGCAGC 5280 

ACTGCATAAT TCTCTTACTG TCATGCCATC CGTAAGATGC TTTTCTGTGA CTGGTGAGTA 5340 

CTCAACCAAG TCATTCTGAG AATAGTGTAT GCGGCGACCG AGTTGCTCTT GCCCGGCGTC 5400 

AATACGGGAT AATACCGCGC CACATAGCAG AACTTTAAAA GTGCTCATCA TTGGAAAACG 5460 
TTCTTCGGGG CGAAAACTCT CAAGGATCTT ACCGCTGTTG AGATCCAGTT CGATGTAACC 5520

CACTCGTGCA CCCAACTGAT CTTCAGCATC TTTTACTTTC ACCAGCGTTT CTGGGTGAGC 5580 

AAAAACAGGA AGGCAAAATG CCGCAAAAAA GGGAATAAGG GCGACACGGA AATGTTGAAT 5640

ACTCATACTC TTCCTTTTTC AATATTATTG AAGCATTTAT CAGGGTTATT GTCTCATGAG 5700 

CGGATACATA TTTGAATGTA TTTAGAAAAA TAAACAAATA GGGGTTCCGC GCACATTTCC 5760 

CCGAAAAGTG CCACCTGACG TCTAAGAAAC CATTATTATC ATGACATTAA CCTATAAAAA 5820 

TAGGCGTATC ACGAGGCCCT TTCGTCTCGC GCGTTTCGGT GATGACGGTG AAAACCTCTG 5880

ACACATGCAG CTCCCGGAGA CGGTCACAGC TTGTCTGTAA GCGGATGCCG GGAGCAGACA 5940 

AGCCCGTCAG GGCGCGTCAG CGGGTGTTGG CGGGTGTCGG GGCTGGCTTA ACTATGCGGC 6000 

ATCAGAGCAG ATTGTACTGA GAGTGCAC 6028 
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CATATGCGGT GTGAAATACC GCACAGATGC GTAAGGAGAA AATACCGCAT CAGGCGCCAT 60 
TCGCCATTCA GGCTGCGCAA CTGTTGGGAA GGGCGATCGG TGCGGGCCTC TTCGCTATTA 120

CGCCAGCTGG CGAAAGGGGG ATGTGCTGCA AGGCGATTAA GTTGGGTAAC GCCAGGGTTT 180

TCCCAGTCAC GACGTTGTAA AACGACGGCC AGTGAATTCC GATTAGTTCA ATTTGTTAAA 240 

GACAGGATCT CAGTAGTCCA GGCTTTAGTC CTGACTCAAC AATACCACCA GCTAAAACCA 300 

CTAGAATACG AGCCACAATA AATAAAAGAT TTTATTTAGT TTCCAGAAAA AGGGGGGAAT 360

GAAAGACCCC ACCAAATTGC TTAGCCTGAT AGCCGCAGTA ACGCCATTTT GCAAGGCATG 420 

GAAAAATACC AAACCAAGAA TAGAGAAGTT CAGATCAAGG GCGGGTACAC GAAAACAGCT 480 

AACGTTGGGC CAAACAGGAT ATCTGCGGTG AGCAGTTTCG GCCCCGGCCC GGGGCCAAGA 540

ACAGATGGTC ACCGCGGTTC GGCCCCGGCC CGGGGCCAAG AACAGATGGT CCCCAGATAT 600 

GGCCCAACCC TCAGCAGTTT CTTAAGACCC ATCAGATGTT TCCAGGCTCC CCCAAGGACC 660 

TGAAATGACC CTGTGCCTTA TTTGAATTAA GCAATCAGCC TGCTTCTCGC TTCTGTTCGC 720 

GCGCTTCTGC TTCCCGAGCT CTATAAAAGA GCTCACAACC CCTCACTCGG CGCGCCAGTC 780 

CTCCGATAGA CTGAGTCGCC CGGGTACCCG TGTATCCAAT AAATCCTCTT GCTGTTGCAT 840 

CCGACTCGTG GTCTCGCTGT TCGTTGGGAG GGTCTCCTCA GAGTGATTGA CTACCCGTCT 900 

CGGGGGTCTT TCATTTGGGG GCTCGTCCGG GATCTGGAGA CCCCTGCCCA GGGACCACCG 960 

ACCCACGACC GGGAGGTAAG CTGGCCAAGA TCTTATATGG GGCACCCCCG CCCCTTGTAA 1020 

ACTTCCCTGA CCCTGACATG ACAAGAGTTA CTAACAGCCC CTCTCTCCAA GCTCACTTAC 1080

AGGCTCTCTA CTTAGTCCAG CACGAAGTCT GGAGACCTCT GGCGGCAGCC TACCAAGAAC 1140 

AACTGGACCG ACCGGTGGTA CCTCACCCTT ACCGAGTCGG CGACACAGTG TGGGTCCGCC 1200 

GACACCAGAC TAAGAACCTA GAACCTCGCT GGAAAGGACC TTACACAGTC CTGCTGACCA 1260 

CCCCCACCGC CCTCAAAGTA GACGGCATCG CAGCTTGGAT ACACGCCGCC CACGTGAAGG 1320
CTGCCGACCC CGGGGGTGGA CCATCCTCTA GACTGACATG GCGCGTTCAA CGCTCTCAAA 1380

ACCCCTTAAA AATAAGGTTA ACCCGCGAGG CCCCCTAATC CCCTTAATTC TTCTGATGCT 1440
CAGAGGGGTC AGTACTGCTT CGCCCGGCTC CAGTCCTCAT CAAGTCTATA ATATCACCTG 1500

GGAGGTAACC AATGGAGATC GGGAGACGGT ATGGGCAACT TCTGGCAACC ACCCTCTGTG 1560 

GACCTGGTGG CCTGACCTTA CCCCAGATTT ATGTATGTTA GCCCACCATG GACCATCTTA 1620 

TTGGGGGCTA GAATATCAAT CCCCTTTTTC TTCTCCCCCG GGGCCCCCTT GTTGCTCAGG 1680

GGGCAGCAGC CCAGGCTGTT CCAGAGACTG CGAAGAACCT TTAACCTCCC TCACCCCTCG 1740 

GTGCAACACT GCCTGGAACA GACTCAAGCT AGACCAGACA ACTCATAAAT CAAATGAGGG 1800

ATTTTATGTT TGCCCCGGGC CCCACCGCCC CCGAGAATCC AAGTCATGTG GGGGTCCAGA 1860 

CTCCTTCTAC TGTGCCTATT GGGGCTGTGA GACAACCGGT AGAGCTTACT GGAAGCCCTC 1920 

CTCATCATGG GATTTCATCA CAGTAAACAA CAATCTCACC TCTGACCAGG CTGTCCAGGT 1980 

ATGCAAAGAT AATAAGTGGT GCAACCCCTT AGTTATTCGG TTTACAGACG CCGGGAGACG 2040

GGTTACTTCC TGGACCACAG GACATTACTG GGGCTTACGT TTGTATGTCT CCGGACAAGA 2100

TCCAGGGCTT ACATTTGGGA TCCGACTCAG ATACCAAAAT CTAGGACCCC GCGTCCCAAT 2160 

AGGGCCAAAC CCCGTTCTGG CAGACCAACA GCCACTCTCC AAGCCCAAAC CTGTTAAGTC 2220 

GCCTTCAGTC ACCAAACCAC CCAGTGGGAC TCCTCTCTCC CCTACCCAAC TTCCACCGGC 2280 

GGGAACGGAA AATAGGCTGC TAAACTTAGT AGACGGAGCC TACCAAGCCC TCAACCTCAC 2340 

CAGTGCTGAC AAAACCCAAG AGTGCTGGTT GTGTCTAGTA GCGGGACCCC CCTACTACGA 2400 

AGGGGTTGCC GTCCTGGGTA CCTACTCCAA CCATACCTCT GCTCCAGCCA ACTGCTCCGT 2460 

GGCCTCCCAA CACAAGTTGA CCCTGTCCGA AGTGACCGGA CAGGGACTCT GCATAGGAGC 2520 

AGTTCCCAAA ACACATCAGG CCCTATGTAA TACCACCCAG ACAAGCAGTC GAGGGTCCTA 2580 
TTATCTAGTT GCCCCTACAG GTACCATGTG GGCTTGTAGT ACCGGGCTTA CTCCATGCAT 2640

CTCCACCACC ATACTGAACC TTACCACTGA TTATTGTGTT CTTGTCiSAAC TCTGGCCAAG 2700 

AGTCACCTAT CATTCCCCCA GCTATGTTTA CGGCCTGTTT GAGAGATCCA ACCGACACAA 2760 

AAGAGAACCG GTGTCGTTAA CCCTGGCCCT ATTATTGGGT GGACTAACCA TGGGGGGAAT 2820

TGCCGCTGGA ATAGGAACAG GGACTACTGC TCTAATGGCC ACTCAGCAAT TCCAGCAGCT 2880 

CCAAGCCGCA GTACAGGATG ATCTCAGGGA GGTTGAAAAA TCAATCTCTA ACCTAGAAAA 2940 

GTCTCTCACT TCCCTGTCTG AAGTTGTCCT ACAGAATCGA AGGGGCCTAG ACTTGTTATT 3000 

TCTAAAAGAA GGAGGGCTGT GTGCTGCTCT AAAAGAAGAA TGTTGCTTCT ATGCGGACCA 3060 

CACAGGACTA GTGAGAGACA GCATGGCCAA ATTGAGAGAG AGGCTTAATC AGAGACAGAA 3120 

ACTGTTTGAG TCAACTCAAG GATGGTTTGA GGGACTGTTT AACAGATCCC CTTGGTTTAC 3180 

CACCTTGATA TCTACCATTA TGGGACCCCT CATTGTACTC CTAATGATTT TGCTCTTCGG 3240 

ACCCTGCATT CTTAATCGAT TAGTTCAATT TGTTAAAGAC AGGATCTCAG TAGTCCAGGC 3300 

TTTAGTCCTG ACTCAACAAT ACCACCAGCT AAAGCCTATA GAGTACGAGC CATAGGGCGC 3360 

CTAGTGTTGA CAATTAATCA TCGGCATAGT ATACGGCATA GTATAATACG ACTCACTATA 3420 

GGAGGGCCAC CATGGCCAAG TTGACCAGTG CCGTTCCGGT GCTCACCGCG CGCGACGTCG 3480 

CCGGAGCGGT CGAGTTCTGG ACCGACCGGC TCGGGTTCTC CCGGGACTTC GTGGAGGACG 3540 

ACTTCGCCGG TGTGGTCCGG GACGACGTGA CCCTGTTCAT CAGCGCGGTC CAGGACCAGG 3600

TGGTGCCGGA CAACACCCTG GCCTGGGTGT GGGTGCGCGG CCTGGACGAG CTGTACGCCG 3660 

AGTGGTCGGA GGTCGTGTCC ACGAACTTCC GGGACGCCTC CGGGCCGGCC ATGACCGAGA 3720 

TCGGCGAGCA GCCGTGGGGG CGGGAGTTCG CCCTGCGCGA CCCGGCCGGC AACTGCGTGC 3780 

ACTTCGTGGC CGAGGAGCAG GACTGANNNN CGGACCGGTC GACTTGTTAA CTTGTTTATT 3840 

GCAGCTTATA ATGGTTACAA ATAAAGCAAT AGCATCACAA ATTTCACAAA TAAAGCATTT 3900 

TTTTCACTGC ATTCTAGTTG TGGTTTGTCC AAACTCATCA ATGTATGTTA TCATGTCTGG 3960

ATCCAGATCT GGGCCCATGC GGCCGCGGAT CGATNNNNAC ATGTGAGCAA AAGGCCAGCA 4020

AAAGGCCAGG AACCGTAAAA AGGCCGCGTT GCTGGCGTTT TTCCATAGGC TCCGCCCCCC 4080 
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TGACGAGCAT CACAAAAATC GACGCTCAAG TCAGAGGTGG CGAAACCCGA CAGGACTATA 4140

AAGATACCAG GCGTTTCCCC CTGGAAGCTC CCTCGTGCGC TCTCCTGTTC CGACCCTGCC 4200

GCTTACCGGA TACCTGTCCG CCTTTCTCCC TTCGGGAAGC GTGGCGCTTT CTCAATGCTC 4260

ACGCTGTAGG TATCTCAGTT CGGTGTAGGT CGTTCGCTCC AAGCTGGGCT GTGTGCACGA 4320

ACCCCCCGTT CAGCCCGACC GCTGCGCCTT ATCCGGTAAC TATCGTCTTG AGTCCAACCC 4380

GGTAAGACAC GACTTATCGC CACTGGCAGC AGCCACTGGT AACAGGATTA GCAGAGCGAG 4440 

GTATGTAGGC GGTGCTACAG AGTTCTTGAA GTGGTGGCCT AACTACGGCT ACACTAGAAG 4500 

GACAGTATTT GGTATCTGCG CTCTGCTGAA GCCAGTTACC TTCGGAAAAA GAGTTGGTAG 4560 

CTCTTGATCC GGCAAACAAA CCACCGCTGG TAGCGGTGGT TTTTTTGTTT GCAAGCAGCA 4620 

GATTACGCGC AGAAAAAAAG GATCTCAAGA AGATCCTTTG ATCTTTTCTA CGGGGTCTGA 4680 

CGCTCAGTGG AACGAAAACT CACGTTAAGG GATTTTGGTC ATGAGATTAT CAAAAAGGAT 4740 

CTTCACCTAG ATCCTTTTAA ATTAAAAATG AAGTTTTAAA TCAATCTAAA GTATATATGA 4800

GTAAACTTGG TCTGACAGTT ACGAATGCTT AATCAGTGAG GCACCTATCT CAGCGATCTG 4860

TCTATTTCGT TCATCCATAG TTGCCTGACT CCCCGTCGTG TAGATAACTA CGATACGGGA 4920 

GGGCTTACCA TCTGGCCCCA GTGCTGCAAT GATACCGCGA GACCCACGCT CACCGGCTCC 4980 

AGATTTATCA GCAATAAACC AGCCAGCCGG AAGGGCCGAG CGCAGAAGTG GTCCTGCAAC 5040

TTTATCCGCC TCCATCCAGT CTATTAATTG TTGCCGGGAA GCTAGAGTAA GTAGTTCGCC 5100 

AGTTAATAGT TTGCGCAACG TTGTTGCCAT TGCTACAGGC ATCGTGGTGT CACGCTCGTC 5160 

GTTTGGTATG GCTTCATTCA GCTCCGGTTC CCAACGATCA AGGCGAGTTA CATGATCCCC 5220 

CATGTTGTGC AAAAAAGCGG TTAGCTCCTT CGGTCCTCCG ATCGTTGTCA GAAGTAAGTT 5280

GGCCGCAGTG TTATCACTCA TGGTTATGGC AGCACTGCAT AATTCTCTTA CTGTCATGCC 5340 

ATCCGTAAGA TGCTTTTCTG TGACTGGTGA GTACTCAACC AAGTCATTCT GAGAATAGTG 5400 

TATGCGGCGA CCGAGTTGCT CTTGCCCGGC GTCAATACGG GATAATACCG CGCCACATAG 5460 

CAGAACTTTA AAAGTGCTCA TCATTGGAAA ACGTTCTTCG GGGCGAAAAC TCTCAAGGAT 5520 

CTTACCGCTG TTGAGATCCA GTTCGATGTA ACCCACTCGT GCACCCAACT GATCTTCAGC 5580

ATCTTTTACT TTCACCAGCG TTTCTGGGTG AGCAAAAACA GGAAGGCAAA ATGCCGCAAA 5640

AAAGGGAATA AGGGCGACAC GGAAATGTTG AATACTCATA CTCTTCCTTT TTCAATATTA 5700

TTGAAGCATT TATCAGGGTT ATTGTCTCAT GAGCGGATAG ATATTTGAAT GTATTTAGAA 5760

AAATAAACAA ATAGGGGTTC CGCGCACATT TCCCCGAAAA GTGCCACCTG ACGTCTAAGA 5820 

AACCATTATT ATCATGACAT TAACCTATAA AAATAGGCGT ATCACGAGGC CCTTTCGTCT 5880 

CGCGCGTTTC GGTGATGACG GTGAAAACCT CTGACACATG CAGCTCCCGG AGACGGTCAC 5940 

AGCTTGTCTG TAAGCGGATG CCGGGAGCAG ACAAGCCCGT CAGGGCGCGT CAGCGGGTGT 6000 

TGGCGGGTGT CGGGGCTGGC TTAACTATGC GGCATCAGAG CAGATTGTAC TGAGAGTGCA 6060 
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CATATGCGGT GTGAAATACC GCACAGATGC GTAAGGAGAA AATACCGCAT CAGGCGCCAT 60 
TCGCCATTCA GGCTGCGCAA CTGTTGGGAA GGGCGATCGG TGCGGGCCTC TTCGCTATTA 120

CGCCAGCTGG CGAAAGGGGG ATGTGCTGCA AGGCGATTAA GTTGGGTAAC GCCAGGGTTT 180 

TCCCAGTCAC GACGTTGTAA AACGACGGCC AGTGAATTCC GATTAGTTCA ATTTGTTAAA 240 

GACAGGATCT CAGTAGTCCA GGCTTTAGTC CTGACTCAAC AATACCACCA GCTAAAACCA 300

CTAGAATACG AGCCACAATA AATAAAAGAT TTTATTTAGT TTCCAGAAAA AGGGGGGAAT 360

GAAAGACCCC ACCAAATTGC TTAGCCTGAT AGCCGCAGTA ACGCCATTTT GCAAGGCATG 420 

GAAAAATACC AAACCAAGAA TAGAGAAGTT CAGATCAAGG GCGGGTACAC GAAAACAGCT 480 

AACGTTGGGC CAAACAGGAT ATCTGCGGTG AGCAGTTTCG GCCCCGGCCC GGGGCCAAGA 540

ACAGATGGTC ACCGCGGTTC GGCCCCGGCC CGGGGCCAAG AACAGATGGT CCCCAGATAT 600 

GGCCCAACCC TCAGCAGTTT CTTAAGACCC ATCAGATGTT TCCAGGCTCC CCCAAGGACC 660 

TGAAATGACC CTGTGCCTTA TTTGAATTAA CCAATCAGCC TGCTTCTCGC TTCTGTTCGC 720 

GCGCTTCTGC TTCCCGAGCT CTATAAAAGA GCTCACAACC CCTCACTCGG CGCGCCAGTC 780 

CTCCGATAGA CTGAGTCGCC CGGGTACCCG TGTATCCAAT AAATCCTCTT GCTGTTGCAT 840 

CCGACTCGTG GTCTCGCTGT TCCTTGGGAG GGTCTCCTCA GAGTGATTGA CTACCCGTCT 900 

CGGGGGTCTT TCATTTGGGG GCTCGTCCGG GATCTGGAGA CCCCTGCCCA GGGACCACCG 960 

ACCCACCACC GGGAGGTAAG CTGGCCAAGA TCCCTAAGGT ACTCGGGTCA GACAATGGCC 1020 
CGGCCTTTGT TGCTCAGGTA AGTCAGGGAC TGGCCACTCA ACTGGGGATA AATTGGAAGT 1080

TACATTGTGC GTATAGACCC CAGAGCTCAG GTCAGGTAGA AAGAATGAAC AGAACAATTA 1140 

AAGAGACCTT GACCAAATTA GCCTTAGAGA CCGGTGGAAA AGACTGGGTG ACCCTCCTTC 1200 

CCTTAGCGCT GCTTAGGGCC AGGAATACCC CTGGCCGGTT TGGTTTAACT CCTTATGAAA 1260

TTCTCTATGG AGGACCACCC CCCATACTTG AGTCTGGAGA AACTTTGGGT CCCGATGATA 1320 

GATTTCTCCC TGTCTTATTT ACTCACTTAA AGGCTTTAGA AATTGTAAGG ACCCAAATCT 1380 

GGGACCAGAT CAAAGAGGTG TATAAGCCTG GTACCGTAAC AATCCCTCAC CCGTTCCAGG 1440 

TCGGGGATCA AGTGCTTGTC AGACGCCATC GACCCAGCAG CCTTGAGCCT CGGTGGAAAG 1500 

GCCCATACCT GGTGTTGCTG ACTACCCCGA CCGCGGTAAA AGTCGATGGT ATTGCTGCCT 1560 

GGGTCCATGC TTCTCACCTC AAACCTGCAC CACCTTCGGC ACCAGATGAG TCCTGGGAGC 1620 
TGGAAAAGAC TGATCATCCT CTTAAGCTGC GTATTCGGCG GCGGCGGGAC GAGTCTGCAA 1680

AATAAGAACC CCCACCAGCC CATGACCCTC ACTTGGCAGG TACTGTCCCA AACTGGAGAC 1740

GTTGTCTGGG ATACAAAGGC AGTCCAGCCC CCTTGGACTT GGTGGCCCAC ACTTAAACCT 1800 

GATGTATGTG CCTTGGCGGC TAGTCTTGAG TCCTGGGATA TCCCGGGAAC CGATGTCTCG 1860 

TCCTCTAAAC GAGTCAGACC TCCGGACTCA GACTATACTG CCGCTTATAA GCAAATCACC 1920 

TGGGGAGCCA TAGGGTGCAG CTACCCTCGG GCTAGGACTA GAATGGCAAG CTCTACCTTC 1980 

TACGTATGTC CCCGGGATGG CCGGACCCTT TCAGAAGCTA GAAGGTGCGG GGGGCTAGAA 2040 

TCCCTATACT GTAAAGAATG GGATTGTGAG ACCACGGGGA CCGGTTATTG GCTATCTAAA 2100

TCCTCAAAAG ACCTCATAAC TGTAAAATGG GACCAAAATA GCGAATGGAC TCAAAAATTT 2160 

CAACAGTGTC ACCAGACCGG CTGGTGTAAC CCCCTTAAAA TAGATTTCAC AGACAAAGGA 2220 

AAATTATCCA AGGACTGGAT AACGGGAAAA ACCTGGGGAT TAAGATTCTA TGTGTCTGGA 2280 

CATCCAGGCG TACAGTTCAC CATTCGCTTA AAAATCACCA ACATGCCAGC TGTGGCAGTA 2340 

GGTCCTGACC TCGTCCTTGT GGAACAAGGA CCTCCTAGAA CGTCCCTCGC TCTCCCACCT 2400 

CCTCTTCCCC CAAGGGAAGC GCCACCGCCA TCTCTCCCCG ACTCTAACTC CACAGCCCTG 2460 

GCGACTAGTG CACAAACTCC CACGGTGAGA AAAACAATTG TTACCCTAAA CACTCCGCCT 2520 

CCCACCACAG GCGACAGACT TTTTGATCTT GTGCAGGGGG CCTTCCTAAC CTTAAATGCT 2580

ACCAACCCAG GGGCCACTGA GTCTTGCTGG CTTTGTTTGG CCATGGGCCC CCCTTATTAT 2640

GAAGCAATAG CCTCATCAGG AGAGGTCGCC TACTCCACCG ACCTTGACCG GTGCCGCTGG 2700

GGGACCCAAG GAAAGCTCAC CGTCACTGAG GTCTCAGGAC ACGGGTTGTG CATAGGAAAG 2760

GTGCCCTTTA CCCATCAGCA TCTCTGCAAT CAGACCCTAT CCATCAATTC CTCCGGAGAC 2820

CATCAGTATC TGCTCCCCTC CAACCATAGC TGGTGGGCTT GCAGCACTGG CCTCACCCCT 2880 

TGCCTCTCCA CCTCAGTTTT TAATCAGACT AGAGATTTCT GTATCCAGGT CCAGCTGATT 2940 

CCTCGCATCT ATTACTATCC TGAAGAAGTT TTGTTACAGG CCTATGACAA TTCTCACCCC 3000 

AGGACTAAAA GAGAGGCTGT CTCACTTACC CTAGCTGTTT TACTGGGGTT GGGAATCACG 3060 

GCGGGAATAG GTACTGGTTC AACTGCCTTA ATTAAAGGAC CTATAGACCT CCAGCAAGGC 3120 

CTGACAAGCC TCCAGATCGC CATAGATGCT GACCTCCGGG CCCTCCAAGA CTCAGTCAGC 3180 

AAGTTAGAGG ACTCACTGAC TTCCCTGTCC GAGGTAGTGC TCCAAAATAG GAGAGGCCTT 3240 

GACTTGCTGT TTCTAAAAGA AGGTGGCCTC TGTGCGGCCC TAAAGGAAGA GTGCTGTTTT 3300

TACATAGACC ACTCAGGTGC AGTACGGGAC TCCATGAAAA AACTCAAAGA AAAACTGGAT 3360

AAAAGACAGT TAGAGCGCCA GAAAAGCCAA AACTGGTATG AAGGATGGTT CAATAACTCC 3420 

CCTTGGTTCA CTACCCTGCT ATCAACCATC GCTGGGCCCC TATTACTCCT CCTTCTGTTG 3480 

CTCATCCTCG GGCCATGCAT CATCAATCGA TTAGTTCAAT TTGTTAAAGA CAGGATCTCA 3540 

GTAGTCCAGG CTTTAGTCCT GACTCAACAA TACCACCAGC TAAAGCCTAT AGAGTACGAG 3600 

CCATAGGGCG CCTAGTGTTG ACAATTAATC ATCGGCATAG TATACGGCAT AGTATAATAC 3660

GACTCACTAT AGGAGGGCCA CCATGGCCAA GTTGACCAGT GCCGTTCCGG TGCTCACCGC 3720 

GCGCGACGTC GCCGGAGCGG TCGAGTTCTG GACCGACCGG CTCGGGTTCT CCCGGGACTT 3780 

CGTGGAGGAC GACTTCGCCG GTGTGGTCCG GGACGACGTG ACCCTGTTCA TCAGCGCGGT 3840

CCAGGACCAG GTGGTGCCGG ACAACACCCT GGCCTGGGTG TGGGTGCGCG GCCTGGACGA 3900

GCTGTACGCC GAGTGGTCGG AGGTCGTGTC CACGAACTTC CGGGACGCCT CCGGGCCGGC 3960

CATGACCGAG ATCGGCGAGC AGCCGTGGGG GCGGGAGTTC GCCCTGCGCG ACCCGGCCGG 4020

CAACTGCGTG CACTTCGTGG CCGAGGAGCA GGACTGANNN NCGGACCGGT CGACTTGTTA 4080 
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ACTTGTTTAT TGCAGCTTAT AATGGTTACA AATAAAGCAA TAGCATCACA AATTTCACAA 4140

ATAAAGCATT TTTTTCACTG CATTCTAGTT GTGGTTTGTC CAAACTCATC AATGTATCTT 4200

ATCATGTCTG GATCCAGATC TGGGCCCATG CGGCCGCGGA TCGATNNNNA CATGTGAGCA 4260

AAAGGCCAGC AAAAGGCCAG GAACCGTAAA AAGGCCGCGT TGCTGGCGTT TTTCCATAGG 4320

CTCCGCCCCC CTGACGAGCA TCACAAAAAT CGACGCTCAA GTCAGAGGTG GCGAAACCCG 4380

ACAGGACTAT AAAGATACCA GGCGTTTCCC CCTGGAAGCT CCCTCGTGCG CTCTCCTGTT 4440

CCGAGCCTGC CGCTTACCGG ATACCTGTCC GCCTTTCTCC CTTCGGGAAG CGTGGCGCTT 4500

TCTCAATGCT CACGCTGTAG GTATCTCAGT TCGGTGTAGG TCGTTCGCTC CAAGCTGGGC 4560

TGTGTGCACG AACCCCCCGT TCAGCCCGAC CGCTGCGCCT TATCCGGTAA CTATCGTCTT 4620

GAGTCCAACC CGGTAAGACA CGACTTATCG CCACTGGCAG CAGCCACTGG TAACAGGATT 4680

AGCAGAGCGA GGTATGTAGG CGGTGCTACA GAGTTCTTGA AGTGGTGGCC TAACTACGGC 4740

TACACTAGAA GGACAGTATT TGGTATCTGC GCTCTGCTGA AGCCAGTTAC CTTCGGAAAA 4800

AGAGTTGGTA GCTCTTGATC CGGCAAACAA ACCACCGCTG GTAGCGGTGG TTTTTTTGTT 4860

TGCAAGCAGC AGATTACGCG CAGAAAAAAA GGATCTCAAG AAGATCCTTT GATCTTTTCT 4920

ACGGGGTCTG ACGCTCAGTG GAACGAAAAC TCACGTTAAG GGATTTTGGT CATGAGATTA 4980

TCAAAAAGGA TCTTCACCTA GATCCTTTTA AATTAAAAAT GAAGTTTTAA ATCAATCTAA 5040

AGTATATATG AGTAAACTTG GTCTGACAGT TACCAATGCT TAATCAGTGA GGCACCTATC 5100

TCAGCGATCT GTCTATTTCG TTCATCCATA GTTGCCTGAC TCCCCGTCGT GTAGATAACT 5160

ACGATACGGG AGGGCTTACC ATCTGGCCCC AGTGCTGCAA TGATACCGCG AGACCCACGC 5220

TCACCGGCTC CAGATTTATC AGCAATAAAC CAGCCAGCCG GAAGGGCCGA GCGCAGAAGT 5280

GGTCCTGCAA CTTTATCCGC CTCCATCCAG TCTATTAATT GTTGCCGGGA AGCTAGAGTA 5340

AGTAGTTCGC CAGTTAATAG TTTGCGCAAC GTTGTTGCCA TTGCTACAGG CATCGTGGTG 5400

TCACGCTCGT CGTTTGGTAT GGCTTCATTC AGCTCCGGTT CCCAACGATC AAGGCGAGTT 5460

ACATGATCCC CCATGTTGTG CAAAAAAGCG GTTAGCTCCT TCGGTCCTCC GATCGTTGTC 5520

AGAAGTAAGT TGGCCGCAGT GTTATCACTC ATGGTTATGG CAGCACTGCA TAATTCTCTT 5580

ACTGTCATGC CATCCGTAAG ATGCTTTTCT GTGACTGGTG AGTACTCAAC CAAGTCATTC 5640

TGAGAATAGT GTATGCGGCG ACCGAGTTGC TCTTGCCCGG CGTCAATACG GGATAATACC 5700

GCGCCACATA GCAGAACTTT AAAAGTGCTC ATCATTGGAA AACGTTCTTC GGGGCGAAAA 5760

CTCTCAAGGA TCTTACCGCT GTTGAGATCC AGTTCGATGT AACCCACTCG TGCACCCAAC 5820

TGATCTTCAG CATCTTTTAC TTTCAGCAGC GTTTCTGGGT GAGCAAAAAC AGGAAGGCAA 5880

AATGCCGCAA AAAAGGGAAT AAGGGCGACA CGGAAATGTT GAATACTCAT ACTCTTCCTT 5940

TTTCAATATT ATTGAAGCAT TTATCAGGGT TATTGTCTCA TGAGCGGATA CATATTTGAA 6000

TGTATTTAGA AAAATAAACA AATAGGGGTT CCGCGCACAT TTCCCCGAAA AGTGCCACCT 6060

GACGTCTAAG AAACCATTAT TATCATGACA TTAACCTATA AAAATAGGCG TATCACGAGG 6120

CCCTTTCGTC TCGCGCGTTT CGGTGATGAC GGTGAAAACC TCTGACACAT GCAGCTCCCG 6180

GAGACGGTCA CAGCTTGTCT GTAAGCGGAT GCCGGGAGCA GACAAGCCCG TCAGGGCGCG 6240

TCAGCGGGTG TTGGCGGGTG TCGGGGCTGG CTTAACTATG CGGCATCAGA GCAGATTGTA 6300

CTGAGAGTGC AC 6312
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CATATGCGGT GTGAAATACC GCACAGATGC GTAAGGAGAA AATACCGCAT CAGGCGCCAT 60 

TCGCCATTCA GGCTGCGCAA CTGTTGGGAA GGGCGATCGG TGCGGGCCTC TTCGCTATTA 120

CGCCAGCTGG CGAAAGGGGG ATGTGCTGCA AGGCGATTAA GTTGGGTAAC GCCAGGGTTT 180

TCCCAGTCAC GACGTTGTAA AACGACGGCC AGTGAATTCC GATTAGTTCA ATTTGTTAAA 240

GACAGGATCT CAGTAGTCCA GGCTTTAGTC CTGACTCAAC AATACCACCA GCTAAAACCA 300

CTAGAATACG AGCCACAATA AATAAAAGAT TTTATTTAGT TTCCAGAAAA AGGGGGGAAT 360

GAAAGACCCC ACCAAATTGC TTAGCCTGAT AGCCGCAGTA ACGCCATTTT GCAAGGCATG 420 

GAAAAATACC AAACCAAGAA TAGAGAAGTT CAGATCAAGG GCGGGTACAC GAAAACAGCT 480 

AACGTTGGGC CAAACAGGAT ATCTGCGGTG AGCAGTTTCG GCCCCGGCCC GGGGCCAAGA 540 

ACAGATGGTC ACCGCGGTTC GGCCCCGGCC CGGGGCCAAG AACAGATGGT CCCCAGATAT 600

GGCCCAACCC TCAGCAGTTT CTTAAGACCC ATCAGATGTT TCCAGGCTCC CCCAAGGACC 660 

TGAAATGACC CTGTGCCTTA TTTGAATTAA CCAATCAGCC TGCTTCTCGC TTCTGTTCGC 720 

GCGCTTCTGC TTCCCGAGCT CTATAAAAGA GCTCACAACC CCTCACTCGG CGCGCCAGTC 780 

CTCCGATAGA CTGAGTCGCC CGGGTACCCG TGTATCCAAT AAATCCTCTT GCTGTTGCAT 840 

CCGACTCGTG GTCTCGCTGT TCCTTGGGAG GGTCTCCTCA GAGTGATTGA CTACCCGTCT 900 

CGGGGGTCTT TCATTTGGGG GCTCGTCCGG GATCTGGAGA CCCCTGCCCA GGGACCACCG 960

ACCCACCACC GGGAGGTAAG CTGGCCAAGA TCCCCCGGGC TGCAGGAATT TATGAAATCC 1020 

TTTATGGGGG ACCCCCCCCT TTGTCAACCT TGCTCAATTC CTTCTCCCCC TCCGATCCTA 1080 

AGACTGATTT ACAAGCCCGA CTAAAAGGGC TGCAAGGCGT GCAGGCCCAA ATCTGGACAC 1140 

CCCTGGCCGA ATTGTACCGG CCAGGACATC CACAAACTAG CCACCCATTT CAGGTGGGAG 1200 

ACTCCGTGTA CGTCCGGCGG CACCGCTCTC AAGGATTGGA GCCTCGTTGG AAGGGACCTT 1260 

ACATCGTCCT GCTGACCACG CGCACCGCCA TAAAGGTTGA CGGGATCGCC GCCTGGATTC 1320 

ACGCATCGCA CGCCAAGGCA GCCCCAAAAA CCCCTGGACC AGAAACTCCC AAAACCTGGA 1380

AGCTCCGCCG TTCGGAGAAC CCTCTTAAGA TAAGACTCTC CCGTGTCTGA CTGCTAATCC 1440 

ACCTTGTCCC TGTACTAACC CAAAATGAAA CTCCCAACAG GAATGGTCAT TTTATGTAGC 1500 

CTAATAATAG TTCGGGCAGG GTTTGACGAC CCCCGCAAGG CTATCGCATT AGTACAAAAA 1560 

CAACATGGTA AACCATGCGA ATGCAGCGGA GGGCAGGTAT CCGAGGCCCC ACCGAACTCC 1620

ATCCAACAGG TAACTTGCCC AGGCAAGACG GCCTACTTAA TGACCAACCA AAAATGGAAA 1680 

TGCAGAGTCA CTCCAAAAAT CTCAGCTAGC GGGGGAGAAC TCCAGAACTG CCCCTGTAAC 1740

ACTTTCCAGG ACTCGATGCA CAGTTCTTGT TATACTGAAT ACCGGCAATG CAGGCGAATT 1800

AATAAGACAT ACTACACGGC CACCTTGCTT AAAATACGGT CTGGGAGCCT CAACGAGGTA 1860

CAGATATTAC AAAACCCCAA TCAGCTCCTA CAGTCCCCTT GTAGGGGCTC TATAAATCAG 1920 

CCCGTTTGCT GGAGTGCCAC AGCCCCCATC CATATCTCCG ATGGTGGAGG ACCCCTCGAT 1980 

ACTAAGAGAG TGTGGACAGT CCAAAAAAGG CTAGAACAAA TTCATAAGGC TATGACTCCT 2040 

GAACTTCAAT ACCACCCCTT AGCCCTGCCC AAAGTCAGAG ATGACCTTAG CCTTGATGCA 2100

CGGACTTTTG ATATCCTGAA TACCACTTTT AGGTTACTCC AGATGTCCAA TTTTAGCCTT 2160

GCCCAAGATT GTTGGCTCTG TTTAAAACTA GGTACCCCTA CCCCTCTTGC GATACCCACT 2220 

CCCTCTTTAA CCTACTCCCT AGCAGACTCC CTAGCGAATG CCTCCTGTCA GATTATACCT 2280 

CCCCTCTTGG TTCAACCGAT GCAGTTCTCC AACTCGTCCT GTTTATCTTC CCCTTTCATT 2340

AACGATACGG AACAAATAGA CTTAGGTGCA GTCACCTTTA CTAACTGCAC CTCTGTAGCC 2400

AATGTCAGTA GTCCTTTATG TGCCCTAAAC GGGTCAGTCT TCCTCTGTGG AAATAACATG 2460

GCATACACCT ATTTACCCCA AAACTGGACC AGACTTTGCG TCCAAGCCTC CCTCCTCCCC 2520 

GACATTGACA TCAACCCGGG GGATGAGCCA GTCCCCATTC CTGCCATTGA TCATTATATA 2580

CATAGACCTA AACGAGCTGT ACAGTTCATC CCTTTACTAG CTGGACTGGG AATCACCGCA 2640 

GCATTCACCA CCGGAGCTAC AGGCCTAGGT GTCTCCGTCA CCCAGTATAC AAAATTATCC 2700 

CATCAGTTAA TATCTGATGT CCAAGTCTTA TCCGGTACCA TACAAGATTT ACAAGACCAG 2760

GTAGACTCGT TAGCTGAAGT AGTTCTCCAA AATAGGAGGG GACTGGACCT ACTAACGGCA 2820

GAACAAGGAG GAATTTGTTT AGCCTTACAA GAAAAATGCT GTTTTTATGC TAACAAGTCA 2880

GGAATTGTGA GAAACAAAAT AAGAACCCTA CAAGAAGAAT TACAAAAACG CAGGGAAAGC 2940 

CTGGCAACCA ACCCTCTCTG GACCGGGCTG CAGGGCTTTC TTCCGTACCT CCTACCTCTC 3000 

CTGGGACCCC TACTCACCCT CCTACTCATA CTAACCATTG GGCCATGCGT TTTCAGTCGC 3060 

CTCATGGCCT TCATTAATGA TAGACTTAAT GTTGTACATG CCATGGTGCT GGCCCAGCAA 3120 

TACCAAGCAC TCAAAGCTGA GGAAGAAGCT CAGGATTGAG GCGCCTAGTG TTGACAATTA 3180 

ATCATCGGCA TAGTATACGG CATAGTATAA TACGACTCAC TATAGGAGGG GCACCATGGC 3240 

CAAGTTGACC AGTGCCGTTC CGGTGCTGAC CGCGCGCGAC GTCGCCGGAG CGGTCGAGTT 3300

CTGGACCGAC CGGCTCGGGT TCTCCCGGGA CTTCGTGGAG GACGACTTCG CCGGTGTGGT 3360

CCGGGACGAC GTGACCCTGT TCATCAGCGC GGTCCAGGAC CAGGTGGTGC CGGACAACAC 3420 

CCTGGCCTGG GTGTGGGTGC GCGGCCTGGA CGAGCTGTAC GCCGAGTGGT CGGAGGTCGT 3480 

GTCCACGAAC TTCCGGGACG CCTCCGGGCC GGCCATGACC GAGATCGGCG AGCAGCCGTG 3540

GGGGCGGGAG TTCGCCCTGC GCGACCCGGC CGGCAACTGC GTGCACTTCG TGGCCGAGGA 3600

GCAGGACTGA NNNNCGGACC GGTCGACTTG TTAACTTGTT TATTGCAGCT TATAATGGTT 3660

ACAAATAAAG CAATAGCATC ACAAATTTCA CAAATAAAGC ATTTTTTTCA CTGCATTCTA 3720 

GTTGTGGTTT GTCCAAACTC ATCAATGTAT CTTATCATGT CTGGATCCAG ATCTGGGCCC 3780 

ATGCGGCCGC GGATCGATNN NNACATGTGA GCAAAAGGCC AGCAAAAGGC CAGGAACCGT 3840

AAAAAGGCCG CGTTGCTGGC GTTTTTCCAT AGGCTCCGCC CCCCTGACGA GCATCACAAA 3900

AATCGACGCT CAAGTCAGAG GTGGCGAAAC CCGACAGGAC TATAAAGATA CCAGGCGTTT 3960

CCCCCTGGAA GCTCCCTCGT GCGCTCTCCT GTTCCGACCC TGCCGCTTAC CGGATACCTG 4020 

TCCGCCTTTC TCCCTTCGGG AAGCGTGGCG CTTTCTCAAT GCTCACGCTG TAGGTATCTC 4080 
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AGTTCGGTGT AGGTCGTTCG CTCCAAGCTG GGCTGTGTGC ACGAACCCCC CGTTCAGCCC 4140 
GACCGCTGCG CCTTATCCGG TAACTATCGT CTTGAGTCCA ACCCGGTAAG ACACGACTTA 4200

TCGCCACTGG CAGCAGCCAC TGGTAACAGG ATTAGCAGAG CGAGGTATGT AGGCGGTGCT 4260

ACAGAGTTCT TGAAGTGGTG GCCTAACTAC GGCTACACTA GAAGGACAGT ATTTGGTATC 4320 

TGCGCTCTGC TGAAGCCAGT TACCTTCGGA AAAAGAGTTG GTAGCTCTTG ATCCGGCAAA 4380 

CAAACCACCG CTGGTAGCGG TGGTTTTTTT GTTTGCAAGC AGCAGATTAC GCGCAGAAAA 4440 

AAAGGATCTC AAGAAGATCC TTTGATCTTT TCTACGGGGT CTGACGCTCA GTGGAACGAA 4500 

AACTCACGTT AAGGGATTTT GGTCATGAGA TTATCAAAAA GGATCTTCAC CTAGATCCTT 4560 

TTAAATTAAA AATGAAGTTT TAAATCAATC TAAAGTATAT ATGAGTAAAC TTGGTCTGAC 4620 

AGTTACCAAT GCTTAATCAG TGAGGCACCT ATCTCAGCGA TCTGTCTATT TCGTTCATCC 4680 

ATAGTTGCCT GACTCCCCGT CGTGTAGATA ACTACGATAC GGGAGGGCTT ACCATCTGGC 4740 

CCCAGTGCTG CAATGATACC GCGAGACCCA CGCTCACCGG CTCCAGATTT ATCAGCAATA 4800 

AACCAGCCAG CCGGAAGGGC CGAGCGCAGA AGTGGTCCTG CAACTTTATC CGCCTCCATC 4860 

CAGTCTATTA ATTGTTGCCG GGAAGCTAGA GTAAGTAGTT CGCCAGTTAA TAGTTTGCGC 4920 

AACGTTGTTG CCATTGCTAC AGGCATCGTG GTGTCACGCT CGTCGTTTGG TATGGCTTCA 4980

TTCAGCTCCG GTTCCCAACG ATCAAGGCGA GTTACATGAT CCCCCATGTT GTGCAAAAAA 5040 

GCGGTTAGCT CCTTCGGTCC TCCGATCGTT GTCAGAAGTA AGTTGGCCGC AGTGTTATCA 5100 

CTCATGGTTA TGGCAGCACT GCATAATTCT CTTACTGTCA TGCCATCCGT AAGATGCTTT 5160 

TCTGTGACTG GTGAGTACTC AACCAAGTCA TTCTGAGAAT AGTGTATGCG GCGACCGAGT 5220 

TGCTCTTGCC CGGCGTCAAT ACGGGATAAT ACCGCGCCAC ATAGCAGAAC TTTAAAAGTG 5280 

CTCATCATTG GAAAACGTTC TTCGGGGCGA AAACTCTCAA GGATCTTACC GCTGTTGAGA 5340 

TCCAGTTCGA TGTAACCCAC TCGTGCACCC AACTGATCTT CAGCATCTTT TACTTTCACC 5400 

AGCGTTTCTG GGTGAGCAAA AACAGGAAGG CAAAATGCCG CAAAAAAGGG AATAAGGGCG 5460

ACACGGAAAT GTTGAATACT CATACTCTTC CTTTTTCAAT ATTATTGAAG CATTTATCAG 5520 

GGTTATTGTC TCATGAGCGG ATACATATTT GAATGTATTT AGAAAAATAA ACAAATAGGG 5580 

GTTCCGCGCA CATTTCCCCG AAAAGTGCCA CCTGACGTCT AAGAAACCAT TATTATCATG 5640 

ACATTAACCT ATAAAAATAG GCGTATCACG AGGCCCTTTC GTCTCGCGCG TTTCGGTGAT 5700

GACGGTGAAA ACCTCTGACA CATGCAGCTC CCGGAGACGG TCACAGCTTG TCTGTAAGCG 5760 

GATGGCGGGA GCAGACAAGC CCGTCAGGGC GCGTCAGCGG GTGTTGGCGG GTGTCGGGGC 5820 

TGGCTTAACT ATGCGGCATC AGAGCAGATT GTACTGAGAG TGCAC 5865 
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Figure 13. hCMV10A1 Sequence
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AGATCTCCCG ATCCCCTATG GTCGACTCTC 
AGCCAGTATC TGCTCCCTGC TTGTGTGTTG 
TAAGCTACAA CAAGGCAAGG CTTGACCGAC 
CGTTTTGCGC TGCTTCGCGA TGTACGGGCC 
AGTTATTAAT AGTAATCAAT TACGGGGTCA 
GTTACATAAC TTACGGTAAA TGGCCCGCCT 
ACGTCAATAA TGACGTATGT TCCCATAGTA 
TGGGTGGACT ATTTACGGTA AACTGCCCAC 
AGTACGCCCC CTATTGACGT CAATGACGGT 
ATGACCTTAT GGGACTTTCC TACTTGGCAG 
ATGGTGATGC GGTTTTGGCA GTACATCAAT 
TTTCCAAGTC TCCACCCCAT TGACGTCAAT 
GACTTTCCAA AATGTCGTAA CAACTCCGCC 
CGGTGGGAGG TCTATATAAG CAGAGCTCTC 
GCTTATCGAA ATGTCGACTG AGAACTTCAG 
CTTTTTCGCT ATTGTAAAAT TCATGTTATA 
TAGAATGGGA AGATGTCCCT TGTATCACCA 
CTTTCTACTC TGTTGACAAC CATTGTCTCC 
TTCGTTAAAC TTTAGCTTGC ATTTGTAACG 
AGATTGTAAG TACTTTCTCT AATCACTTTT 
TACTTCAGCA CAGTTTTAGA GAACAATTGT 
GCATATAAAT TCTGGCTGGC GTGGAAATAT 
CATCATCCTG CCTTTCTCTT TATGGTTACA 
ATACTCTGAG TCCAAACCGG GCCCCTCTGC 
ACAGCTCCTG GGCAACGTGC TGGTTGTTGT 
GGAACAGCAT CAGGACCGAC ATGGAAGGTC 
TTAACCCGTG GAAGTCCTTA ATGGTCATGG 
GCCCCCATCA GGTCTTTAAT GTAACCTGGA 
CCAATGCCAC CTCCCTTTTA GGAACTGTAC
TATGTGATCT GGTCGGAGAA GAGTGGGACC
GCTGCAAATA CCCCGGAGGG AGAAAGCGGA 
GGCATACCGT AAAATCGGGG TGTGGGGGGC 
GTGAAACCAC CGGACAGGCT TACTGGAAGC 
AGCGCGGTAA CACCCCCTGG GACACGGGAT 
ACCTCTCCAA AGTATCCAAT TCCTTCCAAG 
TAGTCCTAGA ATTCACTGAT GCAGGAAAAA 
GACTGAGACT GTACCGGACA GGAACAGATC 
TCCTCAATAT AGGGCCCCGC ATCCCCATTG 
CCCCCTCCCG ACCCGTGCAG ATCAGGCTCC
CAGCCTCTAT AGTCCCTGAG ACTGCCCCAC 
TGCTAAACCT GGTAGAAGGA GCCTATCAGG 
AAGAATGTTG GCTGTGCTTA GTGTCGGGAC 
GCACTTATAC CAATCATTCT ACCGCCCCGG 
TTACCCTATC TGAAGTGACA GGACAGGGCC 
AGGCCTTATG TAACACCACC CAAAGTGCCG 
CTGGAACAAT GTGGGCTTGT AGCACTGGAT
ATCTAACCAC AGACTATTGT GTATTAGTTG 
CCGATTATAT GTATGGTCAG CTTGAACAGC 
TGACCCTGGC CCTTCTGCTA GGAGGATTAA 
CGGGGACCAC TGCCCTAATC AAAACCCAGC 
CAGACCTCAA CGAAGTCGAA AAATCAATTA 
CTGAAGTAGT CCTACAGAAC CGAAGAGGCC 
TCTGCGCAGC CCTAAAAGAA GAATGTTGTT 
ACAGCATGGC CAAACTAAGG GAAAGGCTTA 
AAGGTTGGTT CGAAGGGCAG TTTAATAGAT 
TCATGGGACC TCTAATAGTA CTCTTACTGA 
GATTAGTTCA ATTTGTTAAA GACAGGATCT 
AATACCACCA GCTAAAGCCT ATAGAGTACG 
TCATCGGCAT AGTATACGGC ATAGTATAAT 
AAGTTGACCA GTGCCGTTCC GGTGCTCACC 
TGGACCGACC GGCTCGGGTT CTCCCGGGAC 
CGGGACGACG TGACCCTGTT CATCAGCGCG 
CTGGCCTGGG TGTGGGTGCG CGGCCTGGAC 
TCCACGAACT TCCGGGACGC CTCCGGGCCG 
GGGCGGGAGT TCGCCCTGCG CGACCCGGCC 
CAGGACTGAN NNNCGGACCG GTCGA 



AGTACAATCT GCTCTGATGC 
GAGGTCGCTG AGTAGTGCGC 
AATTGCATGA AGAATCTGCT 
AGATATACGC GTTGACATTG 
TTAGTTCATA GCCCATATAT 
GGCTGACCGC CCAACGACCC 
ACGCCAATAG GGACTTTCCA 
TTGGCAGTAC ATCAAGTGTA 
AAATGGCCCG CCTGGCATTA 
TACATCTACG TATTAGTCAT 
GGGCGTGGAT AGCGGTTTGA 
GGGAGTTTGT TTTGGCACCA 
CCATTGACGC AAATGGGCGG 
TGGCTAACTA GAGAACCCAC 
GGTGAGTTTG GGGACCCTTG 
TGGAGGGGGC AAAGTTTTCA 
TGGACCCTCA TGATAATTTT 
TCTTATTTTC TTTTCATTTT 
AATTTTTAAA TTCACTTTTG 
TTTTCAAGGC AATCAGGGTA 
TATAATTAAA TGATAAGGTA 
TCTTATTGGT AGAAACAACT 
ATGATATACA CTGTTTGAGA 
TAACCATGTT CATGCCTTCT 
GCTGTCTCAT CATTTTGGCA 
CAGCGTTCTC AAAACCCCTT 
GGGTCTATTT AAGAGTAGGG 
GAGTCACCAA CCTGATGACT 
AAGATGCCTT CCCAAGATTA 
CTTCAGACCA GGAACCATAT 
CCCGGACTTT TGACTTTTAC 
CAAGAGAGGG CTACTGTGGT 
CCACATCATC ATGGGACCTA 
GCTCCAAAAT GGCTTGTGGC 
GGGCTACTCG AGGGGGCAGA 
AGGCTAATTG GGACGGGCCC 
CTATTACCAT GTTCTCCCTG 
GGCCTAATCC CGTGATCACT 
CCAGGCCTCC TCAGCCTCCT 
CTTCTCAACA ACCTGGGACG 
CGCTTAACCT CACCAATCCC 
CTCCTTATTA CGAAGGAGTA 
CCAGCTGTAC GGCCACTTCC 
TATGCATGGG AGCACTACCT 
GCTCAGGATC CTACTACCTT 
TGACTCCCTG CTTGTCCACC 
AGCTCTGGCC CAGAATAATT 
GTACCAAATA TAAGAGGGAG 
CCATGGGAGG GATTGCAGCT 
AGTTTGAGCA GCTTCACGCC 
CCAACCTAGA AAAGTCACTG 
TAGATTTGCT CTTCCTAAAA 
TTTATGCAGA CCACACGGGA 
ATCAGAGACA AAAACTATTT 
CCCCCTGGTT TACCACCTTA 
TCTTACTCTT TGGACCCTGC 
CAGTAGTCCA GGCTTTAGTC 
AGCCATAGGG CGCCTAGTGT 
ACGACTCACT ATAGGAGGGC 
GCGCGCGACG TCGCCGGAGC 
TTCGTGGAGG ACGACTTCGC 
GTCCAGGACC AGGTGGTGCC 
GAGCTGTACG CCGAGTGGTC 
GCCATGACCG AGATCGGCGA 
GGCAACTGCG TGCACTTCGT 
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